# Introduction

**Health Question**

I have chosen to study how much gender affects the length of hospital stays after a heart attack. Because of natural biological differences between males and females, a significant body of research shows differences in reactions to medications, in symptoms, in how the disease develops in the body, and in the medical resources used (Lin et al., 2016). With advanced treatment techniques and pressure on health care providers from insurance companies to discharge patients faster, the average hospital stay has decreased to around 5 days for a treatment course without complications, regardless of gender (Spencer et al., 2004). Acknowledging differences in the recovery period for men and women after a heart attack is an important consideration when determining how quickly patients are discharged home. In addition, studies show that women face disadvantages in medical care, including lower survival rates after heart attacks, because their symptoms differ from those of men and make diagnosis and treatment more difficult (Greenwood et al., 2018). If a heart attack affects men and women in clearly different ways, their course of treatment, including the hospital stay, may also differ, and this requires further study.

# Data

| Prompt | Response |
| --- | --- |
| Which variables are you investigating? | Length of stay (LOS) and gender |
| Identify each variable as continuous/quantitative or categorical | Length of stay = quantitative; Gender = categorical |
| List the descriptive statistics that are used to describe that type of variable. | Sample size (n); Central location: mean, median, mode; Spread: standard deviation |
| What does each statistic tell you about the data for the given variable? | n = number of observations in the sample; Mean = average; Median = midpoint of the data; Mode = most frequently occurring value; Standard deviation = how spread out the data set is |
| Compute these descriptive statistics for the variables you are investigating and present them here or in a separate table below | n: Gender 0 = 65, Gender 1 = 35; Mean LOS: Gender 0 = 6.32, Gender 1 = 7.8; Median LOS: Gender 0 = 5, Gender 1 = 6; Mode LOS: Gender 0 = 5, Gender 1 = 4; Standard deviation of LOS: Gender 0 = 3.34, Gender 1 = 8.92 |

**Key Features**

This statistical report was created using the data set provided by Hosmer, Lemeshow, and May (2008). The sample comprises 100 observations gathered during 13 separate 1-year periods, starting in 1975 and ending in 2001, on patients admitted for heart attacks to hospitals in Worcester, Massachusetts (Hosmer et al., 2008). The variables in focus are length of stay (LOS) and the gender of each patient, as these are the key concepts in studying the impact of gender on hospital length of stay after a heart attack. Breaking the data down into Gender 0 and Gender 1 lets us compare the sample size for each gender as well as the mean, median, and mode of LOS, which measure the central location of the data and give us the average or most common value (Gerstman, 2015). The mean and median LOS for Gender 0 were lower than for Gender 1. However, the LOS mode for Gender 1 was only 4 days, lower than the mode for Gender 0, even though the mean LOS for Gender 1 was almost 8 days. The IQR and standard deviation describe the spread, or variability, of the data and indicate how well the mean, median, and mode represent it. The data for Gender 0 show less variability in both the IQR and the standard deviation, which indicates that LOS values varied more widely for Gender 1.
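The descriptive statistics discussed above could be computed per group along the following lines. This is a minimal sketch using only Python's standard library; the `describe` helper and the LOS values in `los_by_gender` are hypothetical and purely illustrative, and they are not the study data.

```python
import statistics


def describe(values):
    """Return the descriptive statistics used in this report for one group."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,
    }


# Hypothetical LOS values in days, keyed by gender code (NOT the study data).
los_by_gender = {
    0: [3, 5, 5, 6, 7, 9],
    1: [4, 4, 6, 8, 17],
}

summary = {gender: describe(values) for gender, values in los_by_gender.items()}

for gender, stats in summary.items():
    print(f"Gender {gender}: {stats}")
```

In the illustrative Gender 1 list, a single long stay pulls the mean above the median and the mode, which is the same pattern described above for the real sample.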

# Limitations

Health issues such as diabetes, peripheral vascular disease, and hypertension can negatively affect outcomes after a heart attack and create additional complications in the recovery process (Mechanic & Grossman, 2020). The data set does not include this additional health information, even though these conditions can affect LOS, which limits the conclusions we can draw. We also need to consider the age of the data. Even the most recent data are almost 20 years old. Since then, there have been updates in the recognition and treatment of patients hospitalized after a heart attack. With more recent data, we might see shorter LOS overall for both genders, which would change our descriptive statistics. Additionally, we have no way to account for insurance providers and the limits they set on reimbursement, which affect how quickly a patient is discharged and therefore our LOS values.

# Process

Using the two-sample *t*-test will allow us to compare two separate groups, Gender 0 and Gender 1, to determine if there is a significant difference between their mean length of stay.

Using the sample sizes, mean lengths of stay, and standard deviations for each gender, we can compute the test statistic. After determining the degrees of freedom from the smaller sample size (*n* – 1), we can find the *P*-value and compare it to our chosen significance level (α). This tells us whether the difference between the two mean lengths of stay is statistically significant, and therefore whether we reject or fail to reject our null hypothesis.
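As a worked illustration of this calculation, the sketch below plugs the summary statistics from the Data section into the test statistic. It assumes the unpooled standard error that usually accompanies the conservative "smaller *n* – 1" degrees of freedom; that pairing is an assumption made here for illustration, and it differs from the pooled-variance version (DF = 98) reported under Test. Subscript 1 denotes Gender 0 and subscript 2 denotes Gender 1.

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
  = \frac{6.32 - 7.80}{\sqrt{\dfrac{3.34^2}{65} + \dfrac{8.92^2}{35}}}
  \approx \frac{-1.48}{1.564}
  \approx -0.95,
\qquad
df = \min(n_1 - 1,\; n_2 - 1) = 34
$$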

Using histogram and dot plot graphs will provide a visual representation of the data we are investigating, allowing us to see the actual or perceived impact that gender has on length of stay for MI patients. These graphs will quickly show whether there is a difference between Gender 0 and Gender 1, while our two-sample *t*-test will allow us to determine whether any difference is statistically significant.

# Data Analysis

**Graphs**

Dot plot (Table 1)

Histogram Gender 0 (Table 2)

Histogram Gender 1 (Table 3)

The dot plot in Table 1 shows the lengths of stay for Gender 0 and Gender 1 in the same graph, so we can compare their spread, spot any outliers, and quickly note major differences between the two samples. Here we see that Gender 0 has a wider range, while Gender 1 is more tightly grouped but has outliers. This graph was chosen because it provides a simple visual for comparing both genders quickly.

The two separate histograms in Table 2 and Table 3 let us interpret our quantitative variable, length of stay, and its frequencies more effectively for Gender 0 and Gender 1. As in the dot plot in Table 1, we can see that Gender 0 has a wider range of days for LOS. The histogram was chosen because it allows us to quickly inspect the distribution of the data and any outliers.

# Test

$\mu_1$: mean of LOS where gender = 0

$\mu_2$: mean of LOS where gender = 1

$\mu_1 - \mu_2$: difference between the two means

$H_0: \mu_1 - \mu_2 = 0$

$H_A: \mu_1 - \mu_2 \neq 0$

(with pooled variances)

**Hypothesis test results:**

| Difference | Sample Diff. | Std. Err. | DF | T-Stat | P-value |
| --- | --- | --- | --- | --- | --- |
| $\mu_1 - \mu_2$ | -1.4769231 | 1.2381423 | 98 | -1.1928541 | 0.2358 |
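The pooled-variance results above can be approximately reproduced from the rounded summary statistics in the Data section. The following is a minimal sketch using only Python's standard library; the function names `pooled_t_test`, `t_pdf`, and `two_sided_p` are hypothetical helpers introduced here, and the *P*-value is obtained by numerically integrating the *t* density with Simpson's rule rather than by whatever method the original statistics software used.

```python
import math


def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic, standard error, and df with pooled variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    return t_stat, std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided P-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area between 0 and |t|
    return 1 - 2 * central_area


# Rounded summary statistics from the Data section (Gender 0, then Gender 1).
t_stat, std_err, df = pooled_t_test(65, 6.32, 3.34, 35, 7.8, 8.92)
p_value = two_sided_p(t_stat, df)

alpha = 0.05  # significance level used in this report
print(f"t = {t_stat:.4f}, SE = {std_err:.4f}, df = {df}, P = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Because the inputs are rounded to two decimal places, the printed values should land close to, but not exactly on, the figures in the table.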

# Best Choice

Using the two-sample *t*-test will allow us to compare two separate groups, Gender 0 and Gender 1, to determine if there is a significant difference between their mean length of stay.

This testing method is the best choice because we are looking at two independent samples, randomly selected, from separate populations (Gerstman, 2015). After choosing the significance level, we can reject or fail to reject our null hypothesis.

# Biostatistical Analysis

The graph in Table 1 compares the length of stay for Gender 0 and Gender 1, allowing us to see that Gender 0 has a wider range for length of stay, but Gender 1 has more outliers. The means for both groups (represented by the green line) are relatively close to each other, and there is not a large variation. Tables 2 and 3 show similar information about the outliers for Gender 1 and the similar means, but they give a better picture of the difference in sample sizes between Gender 0 and Gender 1, with Gender 0 being the larger group.

Table 4 provides all the information from the two-sample *t*-test, including our null and alternative hypotheses, allowing us to reject or fail to reject our null hypothesis. Using statistical analysis, we can compare the calculated *p*-value with the chosen significance level to determine whether observed differences are statistically significant (Gerstman, 2015). This hypothesis test uses a significance level of 0.05, and according to Table 4, the *p*-value is 0.2358.

# Statistical Inference Analysis

Looking at the graphs in Tables 1, 2, and 3, we can visually confirm that there is some difference in length of stay between Gender 0 and Gender 1. Using our two-sample *t*-test, we can determine whether this difference is statistically significant. Our *p*-value, 0.2358, is greater than the significance level of 0.05. We therefore fail to reject the null hypothesis: the difference in length of stay between Gender 0 and Gender 1 is not statistically significant, and the data do not provide sufficient evidence to support the alternative hypothesis.

# Conclusions

**Findings**

Looking at the graphs and the mean LOS for Gender 0 and Gender 1, we can see a difference: Gender 0 has a shorter mean LOS, although overall there are few differences between males and females. Our findings show the importance of statistical analysis. The difference is easy to see in the graphs and averages, but the hypothesis test shows it is not statistically significant. These findings help answer whether gender affects hospital length of stay after a heart attack, but they also highlight that further research is needed.

# Recommendations

Although it is a good start, this research has identified areas for further investigation. A new study with a more balanced number of males and females would give a more accurate comparison, as this study had 65 participants for Gender 0 and only 35 for Gender 1.

Additionally, future studies need to account for other illnesses and disease processes that may also affect length of stay after a heart attack. Finally, the age of the data used here emphasizes the importance of continued study, because the medical advances in heart attack management made since these data were collected will affect hospitalizations for both male and female patients.

# References

Hosmer, D. W., Lemeshow, S., & May, S. (2008). *Applied survival analysis: Regression modeling of time to event data* (2nd ed.). John Wiley and Sons Inc.

Gerstman, B. B. (2015). *Basic biostatistics: Statistics for public health practice* (2nd ed.). Burlington, MA: Jones & Bartlett Learning.

Greenwood, B.N., Carnahan, S., & Huang, L. (2018). Patient-physician gender concordance and increased mortality among female heart attack patients. *Proceedings of the National Academy of Sciences of the United States of America, 115*(34), 8569-8574. https://doi.org/10.1073/pnas.1800097115

Lin, W. C., Ho, C. H., Tung, L. C., Ho, C. C., Chou, W., & Wang, C. H. (2016). Differences between women and men in phase I cardiac rehabilitation after acute myocardial infarction: A nationwide population-based analysis. *Medicine*, *95*(3), 1-6. https://doi.org/10.1097/MD.0000000000002494

Mechanic, O. J., & Grossman, S. A. (2020). *Acute myocardial infarction*. Treasure Island, FL: StatPearls Publishing.

Spencer, F. A., Lessard, D., Gore, J. M., Yarzebski, J., & Goldberg, R. J. (2004). Declining length of hospital stay for acute myocardial infarction and postdischarge outcomes: A community-wide perspective. *Archives of Internal Medicine, 164*(7), 733–740. https://doi.org/10.1001/archinte.164.7.733

The Final Project Data Analysis consists of four milestones (attached; feedback for all four milestones is at the bottom of the attachments) plus the following.

Biostatisticians are constantly called upon to analyze data in order to help researchers and health officials answer critical questions about populations’ health. For this assessment, you will imagine you are a biostatistical consultant on a small study for a local health organization. In the Assignments Guidelines and Rubrics area of the course, you will use the Data Analysis Data Set and Data Analysis Data Description, along with some background information on how and when the data was collected and the general research question the organization is interested in answering. This is often the way you will receive data in the real world. Your task is to help the organization answer their question by critically analyzing the data. You will compute your chosen statistics, interpret the results, and present the results and recommendations to non-technical decision makers in the form of a data analysis. Keep in mind that it is your job to do this from a statistical standpoint. Be sure to justify your conclusions and recommendations with appropriate statistical support. Specifically, you must address the critical elements listed below. Most of the critical elements align with a particular course outcome (shown in brackets).

I. Introduction

A. State the overall health question you have been asked to address in your own words. Be sure you capture the key elements of the question, using language that a non-technical audience can understand.

B. Assess the collected data. Use this section to lay out the source, parameters, and any limitations of your data. Specifically, you should:

1. Describe the key features of your data set. Be sure to assess how these features affect your analysis.

2. Analyze the limitations of the data set you were provided and how those limitations might affect your findings. Justify your response.

C. Process: Propose how you will go about answering the health question you were asked to address based on the data set provided.

II. Data Analysis

A. Graphs: In this section, you will use graphical displays to examine the data.

1. Create at least one graph that gives a sense of the potential relationship between the two variables that form your chosen health question. Include the graph and discuss why you selected it as opposed to others.

B. Conduct an appropriate statistical test to answer your health question.

C. Explain why this test is the best choice in this context.

D. Analysis of Biostatistics: Use this section to describe your findings from a statistical standpoint. Be sure to:

1. Present key biostatistics from the graph(s) and statistical tests and explain what they mean. Be sure to include a spreadsheet showing your work or a copy of your StatCrunch output as an appendix.

2. What statistical inferences or conclusions can you draw based on the results of your statistical test and graph? Justify your response.

III. Conclusions and Recommendations

A. How do the findings help answer your overall health question? Remember to use brief, non-technical language to ensure audience understanding.

B. Recommend areas for further research based on your findings. Remember to use brief, non-technical language to ensure audience understanding.