Robert's Technical Ramblings

WPRDC Project - Smoking and Anxiety Meds in Allegheny County

By Robert Blake

January 30, 2019

My topic for the Dataset project was the relationship between smoking and the use of anxiety meds in Allegheny County. Throughout my life, I've never really understood the appeal of smoking, and part of my interest in this project has been an attempt to understand why someone would smoke. I also have a vested interest in mental health, and felt that this topic might help educate people about it.

Before I discuss my study in further detail, allow me to explain what occurred that led me to change topics from my original "Smoking and Economic Status" project goal. While working through my two data sets, I soon discovered that the income data set was incompatible with the smoking data set. The issue stems from the formatting of the data and the level of detail given. The smoking data covers Allegheny County tract by tract, whereas the census income data provided general information for each county in Western PA but lacked the tract-by-tract view I needed in order to compare the two sets of data.

Upon further searching, I discovered the anxiety data set and found that it largely matches the required level of detail, so I decided to look for relationships between the two. My first finding was the immediately visible relationship between the maps shown on each data set's page. The two maps were very similar, giving me the impression that there was likely at least some form of connection between the two.

I then moved on to matching the data in Tableau. This process was taxing because, as I eventually came to realize, the medication data was given as a number of people, while the smoking data was given as a percent of the population. This skewed the results on the graph I was eventually able to make in Tableau, but the graph still provided good insight. Initially, I felt that such a difference might not matter, but once I created the actual graph, the relationship between the two pieces of data looked less tight than I had originally expected. Because of the difference in units, I felt that placing the bar graphs next to each other, each with its own scale, was a useful solution to the visualization problem.
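One way the count-versus-percent mismatch could be reduced is to divide each tract's medication count by that tract's population, so both measures end up as percentages. The snippet below is only a rough sketch of that idea: the tract IDs, populations, and figures are made-up placeholders rather than values from either data set, and it assumes a per-tract population figure is available, which the data sets described here may not include.

```python
def to_percent(count, population):
    """Convert a head count into a percent of the tract population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * count / population


# Hypothetical tracts (placeholder numbers, not real WPRDC data):
# (tract id, population, people on anxiety meds, percent of population smoking)
tracts = [
    ("A", 2000, 300, 22.0),
    ("B", 4000, 400, 15.0),
    ("C", 1000, 250, 30.0),
]

# Put the medication figure on the same percent scale as the smoking figure.
rows = [(tid, to_percent(meds, pop), smoke) for tid, pop, meds, smoke in tracts]

for tid, meds_pct, smoke_pct in rows:
    print(f"{tid}: meds {meds_pct:.1f}%  smoking {smoke_pct:.1f}%")
```

With both columns in percent, the two bar graphs could share a single axis instead of needing separate scales.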

After making my graphs, I began to look through medical research on connections between the two, and it was not long before I came across several scholarly articles on the subject. Each article I came across suggested that my initial conclusions about the relationship were correct, and I was able to read several of the articles in greater detail than just the abstract.

It was through these readings that the topic of self-medication came up, which served as a point of interest for me and a possible area of future study. Were I to do such a study, I would likely look for data from exactly the same time frame, or at least closer ones, and perhaps data on income and age groups in the tracts, as these would allow closer analysis of my various hypotheses on the subject.


The report on the study