#data #healthcare #python #group

| 🗓️ | May 2024 - Jun 2024 |
| ------- | ------- |
| Team. | Hyseung Kim, Jisu Yoon, Youjee Oh |
| Credit. | -Hyseung: Research, Crawling, Data analysis, Presentation<br>-Jisu: Research, Crawling, Data analysis, Statistics<br>-**Youjee: Research, Crawling, Data analysis, Statistics, Data visualizing** |
| About. | The purpose of the project was to identify problems objectively from data, rather than relying on the designer's intuition. It was carried out in a data-based service design class, where we investigated mental healthcare by crawling YouTube text data with Python. |

The number of people with severe mental illness in South Korea increased from 143,000 in 2013 to 175,000 in 2019, an average annual growth rate of 3.4%. The age at which severe mental illnesses are diagnosed is decreasing, and bipolar disorder has seen a 10.2% increase compared to other disorders. Globally, psychological distress has risen particularly among adolescents. At least 13% of adolescents aged 10 to 19 worldwide are reported to have a diagnosed mental health disorder.

![](https://i.imgur.com/nL6TNmp.png)

Mental health disorders among modern people continue to rise. We decided to identify and analyze the types of mental health disorders people experience, perceptions of mental illness, and awareness of the issues, using YouTube text data as the basis for our research.

Our team was composed of three people, and the project was conducted during the data-based service design class. All members participated in every stage of the process, from desk research to data analysis, with responsibilities divided among the team.
<br> #### Desk Research Trend analysis through desk research ![](https://i.imgur.com/lwBoYTD.png) Human centered thinking based on trend analysis. ![](https://i.imgur.com/b6XGakq.png) **After conducting desk research**, It predicted the policies and awareness would differ across countries. Collecting text data globally expected to be broad and time-consuming, so we decided to narrow the scope within the country. The next step was to ==**identify different perspectives on mental illness and metal health care**==, so we selected youtube videos to collect over 10,000 comments. <br> **YouTube Selection Criteria** 1. To figure out people's differing perspectives on mental illness and mental health, we searched using keywords such as "mental health," "depression," "mental illness in modern society," "mental illness crimes," "stress in modern life," "types of mental illness," and "mental illness awareness". 2. Collected data while considering both mild and severe mental illnesses to avoid bias toward either end of the spectrum. 3. Videos uploaded within the 4 years with at least 200 comments 5. To ensure diversity, avoiding duplication of channels. 6. Primarily selected videos with a high number of comments that revealed opinions or awareness about mental health and mental illness. <br> **Selected Youtube** - [์ •์‹ ๊ฑด๊ฐ•ํŠน์ง‘] ์šฐ๋ฆฌ๋Š” ์™œ ๋ถˆ์•ˆํ• ๊นŒ? ๋ถˆ์•ˆ์žฅ์• ์˜ ์‹ค์ฒด์™€ ์น˜๋ฃŒ๋ฒ• | ๋ถˆ์•ˆ์žฅ์•  ์—ฐ๊ตฌ์˜ ์„ธ๊ณ„์ ์ธ ๊ถŒ์œ„์ž ๋ณด๋ฅด๋นˆ ๋ฐ˜๋ธ๋กœ ๊ฐ•์˜ ๋ชฐ์•„๋ณด๊ธฐย  (10๊ฐœ์›” ์ „) [๋งํฌ](https://www.youtube.com/watch?v=HPkfrbst_10) - ์ •์‹ ๊ณผ ์ƒ๋‹ด ๊ณ ๋ฏผ์ด์‹  ๋ถ„๋“ค ํ•„์ˆ˜ ์‹œ์ฒญ!๐Ÿค” ์ •์‹ ๊ณผ ๋น„์šฉ, ๋ณดํ—˜, ํ•ญ์šฐ์šธ์ œ ๋“ฑ ์ •์‹ ๊ณผ์— ๊ด€ํ•œ ๋ชจ๋“  ๊ฑธ ์•Œ๋ ค๋“œ๋ฆฝ๋‹ˆ๋‹ค ๐Ÿ‘จโ€โš•๏ธ[์–‘๋ธŒ๋กœ์˜ ์ •์‹ ์„ธ๊ณ„] (11๊ฐœ์›” ์ „) [๋งํฌ](https://www.youtube.com/watch?v=dH3dUtbzpUA) - ์šฐ์šธ์ฆ์„ ๊ฒช๊ณ  ์žˆ๋Š” ๊ฒƒ์€ ๋‹น์‹ ์˜ ํƒ“์ด ์•„๋‹™๋‹ˆ๋‹ค. 
์˜์ง€๋‚˜ ์ •์‹ ๋ ฅ์œผ๋กœ ๊ทน๋ณต ๋ถˆ๊ฐ€ํ•œ ์šฐ์šธ์ฆ, ์ฃผ๋ณ€ ์‚ฌ๋žŒ์ด "์ ˆ๋Œ€" ๋งํ•˜๋ฉด ์•ˆ ๋˜๋Š” ๊ธˆ๊ธฐ์–ด! [๊ฑด๊ฐ• ์ฝ์–ด๋“œ๋ฆฝ๋‹ˆ๋‹ค] | ๋…ธ๊ทœ์‹ ๋ฐ•์‚ฌ (2022. 9. 16.) [๋งํฌ](https://www.youtube.com/watch?v=JQT1xFBtDE8) - ์‚ฌ๋žŒ๋“ค์ด ์ž˜ ๋ชจ๋ฅด๋Š” ์šฐ์šธ์ฆ ํ™˜์ž ๊ฐ์ •์˜ ์‹ค์ฒด! ์ •์‹ ๊ณผ ์˜์‚ฌ๊ฐ€ ์•Œ๋ ค๋“œ๋ฆฝ๋‹ˆ๋‹ค (2022. 9. 30.) [๋งํฌ](https://www.youtube.com/watch?v=MZBTIt_lkwM) - ์„ฑ์ธ ADHD 80%๋Š” ๋‹ค๋ฅธ ์ •์‹ ์งˆํ™˜์„ ๋™๋ฐ˜ํ•œ๋‹ค | ADHD ํ™˜์ž์˜ ๊ฒฝํ—˜ | ์”จ๋ฆฌ์–ผ ์‹œ์„  (2021. 7. 14.) [๋งํฌ](https://www.youtube.com/watch?v=LptT728hqOc) - ์ •์‹ ๊ณผ ์˜์‚ฌ๊ฐ€ ์•Œ๋ ค์ฃผ๋Š” ๊ฒฝ๊ณ„์„  ์ง€๋Šฅ์˜ ์˜คํ•ด์™€ ์ง„์‹ค (2022. 9. 23.) [๋งํฌ](https://www.youtube.com/watch?v=_Z3B30lI9eg) - [๊ณ ๋ฏผ์‚ฌ์—ฐ] ์˜์‚ฌ์—๊ฒŒ ๋ฌผ์–ด๋ดค์Šต๋‹ˆ๋‹ค. '์šฐ์šธ์ฆ' ํ™˜์ž์™€ ๊ฒฐํ˜ผํ•ด๋„ ๋ ๊นŒ์š”? (2022. 4. 28.) [๋งํฌ](https://www.youtube.com/watch?v=ZYnTwSsw3Qk) - "์šฐ์šธ์ฆ ์•„๋‹ˆ๋‹ค" 2030์—์„œ ๊ธ‰์ฆํ•˜๊ณ  ์žˆ๋Š” ์ •์‹ ์งˆํ™˜ (2022. 5. 26.) [๋งํฌ](https://www.youtube.com/watch?v=FXsSrzy6JZY) - ๋Œ€ํ•œ๋ฏผ๊ตญ ์šฐ์šธ์ฆ OECD 1์œ„ (2022. 3. 15.) [๋งํฌ](https://www.youtube.com/watch?si=eQ6xe5mk6T5gZSrn&v=SXl_pnG0j94&feature=youtu.be) - [ํ™ํ˜œ๊ฑธ์˜ ์ธ์‚ฌ์ดํŠธ ์ธํ„ฐ๋ทฐ] #12 ์กฐํ˜„๋ณ‘๊ณผ ๊ฐ•๋ฐ•์ฆ์— ๋Œ€ํ•œ ํ†ต์ฐฐ (์„œ์šธ๋Œ€๋ณ‘์› ๊ถŒ์ค€์ˆ˜ ๊ต์ˆ˜ & ์˜ํ•™์ „๋ฌธ๊ธฐ์ž ํ™ํ˜œ๊ฑธ) (2021. 6. 26.) [๋งํฌ](https://www.youtube.com/live/i9peTnVxMeg) - ๋‹น์‹ ์€ ์ •๋ง ADHD์ธ๊ฐ€? ๋‡Œ๊ณผํ•™์ž๊ฐ€ ์ •์‹ ์งˆํ™˜ ์† ์‚ด์•„๋‚จ๋Š” ๋ฐฉ๋ฒ• (2021. 12. 6.) [๋งํฌ](https://www.youtube.com/watch?v=_zsv7nig9FA) - ๋‹คํ ์‹œ์„  - ์šฐ๋ฆฌ๋Š” ์กฐํ˜„๋ณ‘ ๋‹น์‚ฌ์ž์ž…๋‹ˆ๋‹ค (2019. 6. 27.) [๋งํฌ](https://www.youtube.com/watch?v=d6hsrARS-oYใ……) <br> ##### Text Data Crawling We crawled 10,032 comments, including replies, to use for the data analysis. ![](https://i.imgur.com/RtCVknK.gif) Since the spelling, spacing, and phrasing of words extracted from the comments varied, we had to standardize them, consolidate words with the same meaning, and remove stopword. 1. 
๋„์–ด์“ฐ๊ธฐ ํ†ต์ผ ๋ถˆ์•ˆ์žฅ์•  = ๋ถˆ์•ˆ ์žฅ์•  ๊ณตํ™ฉ์žฅ์•  = ๊ณตํ™ฉ ์žฅ์•  ์ •์‹ ๊ณผ ์น˜๋ฃŒ = ์ •์‹ ๊ณผ์น˜๋ฃŒ ์šฐ์šธ์ฆ = ์šฐ์šธ ์ฆ ๋ถˆ์•ˆ ์š”์†Œ = ๋ถˆ์•ˆ์š”์†Œ ์•ฝ๋ฌผ์น˜๋ฃŒ = ์•ฝ๋ฌผ ์น˜๋ฃŒ ์ž์‚ด ์‹œ๋„ = ์ž์‚ด์‹œ๋„ ์ •์‹ ๊ณผ ๋ณ‘์› = ์ •์‹ ๊ณผ๋ณ‘์› ADHD = adhd = Adhd ๊ฒฝ๊ณ„์„ ์ง€๋Šฅ์žฅ์•  = ๊ฒฝ๊ณ„์„  ์ง€๋Šฅ ์žฅ์•  ๊ฒฝ๊ณ„์„ ์ง€๋Šฅ ์žฅ์•  = ๊ฒฝ๊ณ„์„  ์ง€๋Šฅ ์•Œ์ฝœ์ค‘๋… = ์•Œ์ฝœ ์ค‘๋… ์‹ฌ๊ฐํ•œ ๊ฒฝ์Ÿ = ์‹ฌ๊ฐํ•œ๊ฒฝ์Ÿ ์กฐํ˜„๋ณ‘ ํ™˜์ž = ์กฐํ˜„๋ณ‘ํ™˜์ž ์ •์‹ ๊ฑด๊ฐ• = ์ •์‹  ๊ฑด๊ฐ• ์šฐ์šธ์ฆ ํ™˜์ž = ์šฐ์šธ์ฆํ™˜์ž ์šฐ์šธ ์ƒํƒœ = ์šฐ์šธ์ƒํƒœ ์‚ฌํšŒ์ƒํ™œ = ์‚ฌํšŒ ์ƒํ™œ ์‹ฌ๋ฆฌ์น˜๋ฃŒ = ์‹ฌ๋ฆฌ ์น˜๋ฃŒ ์ •์‹ ์งˆํ™˜ = ์ •์‹  ์งˆํ™˜ ์ •์‹ ๊ณผ ์˜์‚ฌ = ์ •์‹ ๊ณผ์˜์‚ฌ ์ •์‹ ๊ณผ ๋ณ‘์› = ์ •์‹ ๊ณผ๋ณ‘์› ์ •์‹  ๋ณ‘์› = ์ •์‹ ๋ณ‘์› 2. ๋‹จ์–ด ํ†ต์ผ ์ž์‚ด๋ฅ  = ์ž์‚ด์œจ ๊ฒฝ๊ณ„์„ ์ง€๋Šฅ์žฅ์•  = ๊ฒฝ์ง€ ํ•œ๊ตญ = ๋Œ€ํ•œ๋ฏผ๊ตญ ๋ฌด๊ธฐ๋ ฅ = ๋ฌด๊ธฐ๋ ฅ์ฆ ์ •์‹ ๊ณผ = ์ •์‹ ๋ณ‘์› ์กฐ์šธ์ฆ = ์–‘๊ทน์„ฑ์žฅ์•  3. ๋ถˆ์šฉ์–ด ์„ ์ • ๋น„๋ˆ„ ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค ๋ง๊ฑฐ๋ผ ์ข‹๊ฒ ์–ด์š” ๊ถ๊ธˆํ•˜๋ฉด ๋˜์š” ํ•ด๋ณด์…จ๋‚˜์š” ๋ผ๋ฆฌ ์™”๋‹ˆ ์ฐพ์•„๋ณด๊ณ ๋Š” ๋ ค๊ณ  ํ•˜๋Š”๊ฑฐ ์•„๋‹˜ ์•„๋‹ˆ์ž–์•„ ์žฌ๋ฏผ ๋งž๋ƒ ๋ณด์—ฌ์ฃผ๋Š”๊ฑด๋ฐ ์‹œํ‚ค๋ ค๊ณ  ํž˜๋‚ด์„ธ์š” ๋‚œ๊ฐ€ ๊ทธ๋ƒฅ ์•„๋‹๊นŒ ํ•˜์ง€๋ง์ž ํ—ท๊ฐˆ๋ฆฌ๊ฒŒ ์‰ฝ๊ฒŒ ์—†์œผ๋‹ˆ ๋ชจ๋ฅด์‹œ๋ฉด ์ œ๋ฐœ ๋งŒ๋“ค์–ด๋ผ ๋ณด์ด๋Š” ์—ฌ๋Ÿฌ ๋‹ค๋ฅด์ง€ ๋Œ“๊ธ€ ์‚ฌ๋„ค ์ฐพ์•„๋ด ์ €๋Ÿฐ๊ฑฐ ์‹œํ‚ค๋ฉด ๊ฑธ๋ฆฌ๋Š” ์ด์—ˆ์Šต๋‹ˆ๋‹ค ์•Š๋‚˜์š” ์ด๋Ÿฐ๊ฑด ์•„๋‹ˆ๋ƒ ๋ณธ์  ํ•˜๋Š”๊ฒŒ ํŒ”์ด ๋„ˆ๋ฌด ์•Œ์•„์„œ ๋งŽ๋„ค์š” ํ•ด์ฃผ๋ฉด ์ž–์•„ ๋ง๊ตฌ ๋ณด๋ฉด ... 4. ํ˜•ํƒœ์†Œ ์‚ฌ์ „์— ์‹ ์กฐ์–ด ์ถ”๊ฐ€ ๋“„ = EBS ๊ฐ€์Šค๋ผ์ดํŒ… ์ธ์‹ธ ์•„์Šคํผ๊ฑฐ ์—์ด๋””์—์ด์น˜๋”” = adhd ์— ์ง€ = mz ์ž์‚ด = ใ…ˆใ…… 5. ํ˜•ํƒœ์†Œ ๋ถ„์„ ![](https://i.imgur.com/JzIt8lA.png) <br> ### Text Data Analysis **The analysis was conducted on a total 7,544 comments organized through previous process.** the comments were divided into 5 topics, spread across 4 sections. 
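As a rough illustration of the cleaning that produced these comments, the spacing unification, word unification, and stopword removal steps can be sketched in plain Python. This is only a sketch under stated assumptions, not the team's actual script: the table names, the `normalize` and `tokenize` helpers, and the choice of the left-hand form of each pair as the canonical one are all hypothetical, and a whitespace split stands in for the morphological analyzer, which the post does not name.

```python
import re

# Hypothetical lookup tables holding a few of the pairs from the cleaning lists.
# Assumption: the left-hand form of each pair is the canonical one.
SPACING_VARIANTS = {
    "불안 장애": "불안장애",
    "공황 장애": "공황장애",
    "우울 증": "우울증",
    "정신 건강": "정신건강",
    "adhd": "ADHD",
    "Adhd": "ADHD",
}
SYNONYMS = {
    "자살율": "자살률",
    "대한민국": "한국",
    "양극성장애": "조울증",
}
STOPWORDS = {"감사합니다", "그냥", "너무", "제발", "댓글"}


def normalize(comment: str) -> str:
    """Rewrite spacing and wording variants to their canonical form."""
    for table in (SPACING_VARIANTS, SYNONYMS):
        # Longest variants first so a short variant never clobbers a longer one.
        for variant in sorted(table, key=len, reverse=True):
            comment = comment.replace(variant, table[variant])
    return comment


def tokenize(comment: str) -> list[str]:
    """Normalize, split on whitespace, and drop stopwords.

    The whitespace split is a stand-in for a real morphological analyzer.
    """
    tokens = re.findall(r"\S+", normalize(comment))
    return [token for token in tokens if token not in STOPWORDS]


if __name__ == "__main__":
    print(tokenize("그냥 불안 장애 너무 힘들어요"))  # ['불안장애', '힘들어요']
```

Plain substring replacement is easy to read but can over-merge (for example, "우울 증상" would become "우울증상"), so a real pipeline would more likely apply these rules at the token level after morphological analysis.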
![](https://i.imgur.com/aJQyCUU.png)

![](https://i.imgur.com/UiNJ0Lw.png)

After organizing the main keywords, proportion, and coherence score for the five topics, **Topic 1 and Topic 5** had the highest coherence scores (subject consistency) and proportions (document distribution by topic), so we focused the analysis more on Topics 1 and 5 than on Topics 2, 3, and 4.

![](https://i.imgur.com/jApVwKG.png)

Topic insights were gathered by referring to the Perc_contribution score of each topic's keywords and the related comments. Because a higher score indicates a comment that is more relevant to the keywords, we were able to focus the analysis on the most related comments for each topic.

![](https://i.imgur.com/ind0oCN.png)

<br>

#### Visualize Data

We compared and integrated the categorized insights to visualize key findings.

![](https://i.imgur.com/rbcrZfS.gif)

1. One of the unexpected results was that **personal stories were mentioned significantly** more often than societal systems.

<br>

---

![](https://i.imgur.com/i7C4CXS.png)

2. People often described their own experiences with a sense of uncertainty, whereas they tended to **share others' experiences with much more certainty**.

<br>

---

![](https://imgur.com/8ei4mqm.png)

3. Perceptions of psychiatric care showed a similar balance of positive (17%) and negative (10%) views, with neutral responses being the most common. In these cases, people tended to focus primarily on sharing their personal experiences.

<br>

---

![](https://i.imgur.com/pCytXmx.png)

4. Comments about mental illness were highly polarized between negative and positive views. Negative comments were typically stories about other people, while positive ones tended to share personal experiences or offer words of support.

<br>

---

![](https://i.imgur.com/dDKK9Hg.png)

5. Positive and negative comments differed by type of mental illness.
While schizophrenia, depression, and borderline intellectual functioning drew mostly negative comments, ADHD showed a noticeably more positive tone, often framed through references to well-known public figures who have shared their experiences.

<br>

---

![](https://i.imgur.com/bgiMqWQ.png)

6. Only 18% of people who have experienced mental health issues use mental health platforms. The biggest reason for not seeking help was the belief that their condition wasn't serious enough to need treatment.

---

#### Insight

**Negative views on mental illness were mostly related to how others perceived it.**

When it came to their own experiences, people often shared their stories or expressed support and empathy, reflecting a more positive perspective. In contrast, mental illness in others was often viewed negatively, likely because individuals felt it was unrelated to themselves.

Additionally, when discussing their own symptoms of mental illness, people tended to use uncertain language. When it came to symptoms in others, however, they often spoke more directly and were quick to make judgments. Further research revealed that even among people experiencing mental health issues, the rate of using mental health services was very low.

As a result, we concluded that modern individuals may not have an accurate understanding of their own mental health, and it can be inferred that they are afraid of how others perceive them. This led us to raise the following issues.

**Modern individuals need to have a clear understanding of their own mental state.**

ADHD, depression, and paranoia are examples of mental illnesses that individuals might be experiencing without realizing or acknowledging them. **Because they don't think it's their issue**, they often lack empathy and view mental illness negatively. However, mental illness is something that **many modern individuals are actually experiencing, and its prevalence is increasing. If more people recognize that it is a condition anyone can face**, the negative perspective could diminish, and the barriers to seeking treatment could be reduced as well.

---

We analyzed perspectives on, and differences in awareness of, mental illness based on YouTube text data. While the data is limited to YouTube, expanding the investigation globally could provide different insights from each country. This is especially true of the U.S., where the barrier to psychological counseling is lower than in Korea; it would be interesting to compare the differences in systems and public awareness.

The process of finding design solutions based on the analysis was not included in this project. Instead, the focus was on handling and analyzing the text data, which was the primary goal of the project.