
T test in Excel for Hospital Performance Data: Approach

Data Science Consultant, Freedman HealthCare, LLC

Mr. Ganguly is a seasoned and highly skilled consultant who brings expertise and leadership in the areas of data and database design, analysis, coding, statistical modelling, and developing analytic products to the FHC team. In his capacity as a Data Science Consultant, Mr. Ganguly has guided clients on addressing analytical and technical challenges, and provided them with meaningful, accurate, and timely analysis of health data.

Post 1:

Imagine you are in a pinch and need to run an analysis quickly and produce meaningful results. You throw your data into Excel, thinking all you need to do is enter some formulas and make it pretty. And then, after opening Excel and toying with your dataset, you realize this might be harder than it appeared. You know how the story ends: frantic Google searching and a million open tabs, all part of a daunting quest for that one perfect formula.

If you’re looking for how to perform T tests in Excel, then you have come to the right spot! In this series of posts, I’ll walk through how to use a T test to assess whether hospitals perform statistically better or worse on a series of quality measures as compared to the average. In this first post, I will focus on my approach and the why behind it.

Assumptions:

I assume a normal distribution and will use a two-tailed test at alpha = 0.05 to determine statistical significance: a p-value less than or equal to 0.05 for a given hospital will indicate a statistically significant difference in performance between that hospital and the average. Because I’ll be comparing the performance data for each individual hospital against the average performance across all the hospitals in this set, my approach will be to find the p-value of the z-score for each hospital.
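To make the arithmetic concrete before we get to Excel, here is a minimal Python sketch of the calculation described above: a z-score for each hospital relative to the group average, converted to a two-tailed p-value from the standard normal distribution. The hospital names and minutes are made up, the function name is hypothetical, and using the sample standard deviation (rather than the population one) is my assumption for the sketch, not necessarily the choice made in the spreadsheet.

```python
from statistics import NormalDist, mean, stdev

ALPHA = 0.05  # two-tailed significance level used in this post


def two_tailed_p_values(values):
    """Return {hospital: (z_score, p_value)} versus the group average.

    `values` maps a hospital name to its measure (e.g. median minutes).
    Assumption: the spread is the sample standard deviation of all hospitals.
    """
    data = list(values.values())
    avg = mean(data)
    sd = stdev(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    results = {}
    for name, x in values.items():
        z = (x - avg) / sd
        # Two-tailed: probability of a z-score at least this far from 0
        p = 2 * (1 - std_normal.cdf(abs(z)))
        results[name] = (z, p)
    return results


# Made-up median times in minutes, for illustration only
example_minutes = {
    "Hospital A": 250, "Hospital B": 255, "Hospital C": 260,
    "Hospital D": 262, "Hospital E": 265, "Hospital F": 268,
    "Hospital G": 270, "Hospital H": 272, "Hospital I": 275,
    "Hospital J": 420,
}

example_results = two_tailed_p_values(example_minutes)
for hospital, (z, p) in example_results.items():
    flag = "significant" if p <= ALPHA else "not significant"
    print(f"{hospital}: z = {z:.2f}, p = {p:.4f} ({flag})")
```

In this made-up set, only Hospital J sits far enough from the average to fall at or below the 0.05 threshold. One thing the sketch makes visible: with very few hospitals, a z-score built this way can never get large enough to reach significance, so the size of the comparison set matters.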

Understanding the analysis:

Of course, with the T test, the metrics must be continuous variables, such as a median time or a score, rather than dichotomous or nominal variables such as proportions or rates. In this example, we’ll consider the measure “Median Time from ED Arrival to ED Departure for Admitted ED Patients”, which comes from the Hospital Compare data repository. Hospital Compare is managed by the Centers for Medicare & Medicaid Services (CMS) and contains data that captures hospital performance on process, structure, outcome, and patient experience.

The data for this measure will therefore be measured in minutes, a continuous variable, and the objective will be to determine whether the minutes for each hospital in a set of hospitals are statistically different from the average minutes for the set. Understanding the directionality of the results is important too; in this example, a greater median time indicates inferior performance, so a lower median time is the desired outcome. For example, a hospital with a statistically lower median time compared to the average (of all the hospitals in the analysis) is a statistically better performer compared to the average.
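The directionality rule above can be written down as a small decision function. This is only a sketch: the function name, the labels, and the `lower_is_better` switch are hypothetical, and the z-score and p-value are assumed to have been computed already for the hospital in question.

```python
def classify_performance(z_score, p_value, alpha=0.05, lower_is_better=True):
    """Label a hospital relative to the group average.

    z_score: how far the hospital sits from the average (negative = below it)
    p_value: two-tailed p-value for that z-score
    lower_is_better: True for measures like median ED time in minutes
    """
    if p_value > alpha:
        # Not statistically different from the average
        return "no different from average"
    below_average = z_score < 0
    if below_average == lower_is_better:
        return "better than average"
    return "worse than average"


# Made-up inputs, for illustration only
print(classify_performance(-2.50, 0.012))  # significantly fewer minutes
print(classify_performance(2.80, 0.005))   # significantly more minutes
print(classify_performance(0.60, 0.550))   # within normal variation
```

Keeping the direction as an explicit setting is a design choice worth copying into the spreadsheet: for a measure where a higher value is the desired outcome, the same significant result flips from "worse" to "better".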

 

Just as a note: the data that I’ll use is made up and is used solely for the purpose of the example.

In the next post, I’ll walk through how to set up your Excel spreadsheet to run the test.
