Understory 2021

Informal Writing in Social Media Posts: Deviation Patterns by Age and Gender

Introduction 

Social media is a relatively new creation when measured against the history of language. Alongside it, a somewhat unnoticed mode of communication has flourished: informal writing. For the purposes of this research, I’ll refer to this informal writing as “non-standard” or a “deviation” from what would normally be considered proper or standard writing. Social media platforms have created a place for people to write informally or, in other words, to emulate their social speech through written text. Speech, in this context, refers to all the nuances that come with understanding a message (e.g. tone of voice, body language, sarcasm). The purpose of this research is to show how often deviation is performed, how these deviations vary by age and sex, and why people may be performing specific deviations.

I propose three hypotheses as to what the data will uncover. The first is that the youngest age group (13-17 years) will perform the highest percentage of deviation in their writing, and that these deviations will primarily be in the form of capitalization. This is not to suggest that they won’t also perform other deviations frequently, just that their capitalization deviation frequency will be higher than that of any other group. The second hypothesis is that the frequency of deviation between the sexes will vary across age groups. In other words, one sex may have a higher deviation percentage in a given age group but a lower deviation percentage in a different age group. The third hypothesis is that the oldest age group (26-40 years) will use informal contractions more than the other three categories of deviation. Informal contractions are (arguably) the most used informal element of speech. They are used so unconsciously that people rarely notice using them anymore. Since these contractions have become so common in speech, I believe the oldest age group will also use them more frequently in informal writing. This is just an assumption, though, as there is no way to tell whether the subject intended to deviate in the informal contraction or did so unconsciously. It’s also important to note that the most popular informal contractions are not autocorrected or red-underlined by spellcheck.

Two assumptions must also be made for this research. The first is that the writer of the text understands they are deviating from standard English to write in a non-standard form. Deviation is not the same thing as making an error, so any non-standard usage that is not a conscious decision would not be considered deviation. If what I believe about the unconscious use of informal contractions is true, then it’s difficult to claim deviation, but since I have no way of knowing whether the informal contractions were used unconsciously, they will be marked as deviations for the purposes of this research. The second assumption is that the sample used to create these data is representative of social media populations as a whole, which may or may not be the case.

Method 

Data gathering was the first step in providing the material necessary to test the hypotheses. It was determined that the same number of subjects would be needed in each group, and that each group should have the same number of males and females. This study has twelve (12) males and twelve (12) females in each of three age groups (13-17, 18-25, and 26-40), which results in 72 total subjects. Each subject had five (5) of the most recent posts from their account documented for analysis, for a total of 360 social media posts. These posts, however, needed to be deemed substantive in order to count towards the data in this study. An example of a non-substantive post would be a one-word caption for a picture the subject had posted. These types of posts were not included because they would misrepresent the scope of this research. Any other post that used words to form some type of complete thought was documented for analysis. Also, if a subject did not have five substantive posts in total on their account, they were not selected for the study. This means that the subjects’ accounts needed to be vetted for content before being selected for the study.

To find study participants, I searched for keywords or issues that certain age groups  would most likely be talking about. Once I was able to find a subject and estimate their age  accurately, I used their friends/followers list to find other subjects in the same age group until I  had twenty-four subjects in each group. In order to maintain the integrity of the data, I used  multiple friends/followers sets from multiple individuals to gather subjects. Not doing this could  result in skewed data because of the potential for groups of friends to follow each other in their  trends. By finding three or four subjects not known to each other, and then finding other subjects  through the initial three or four, I was able to maintain the integrity of the data. This process was  repeated for each age group until enough subjects were found.  

There are three other notes that should be made about data gathering. Number one: the subjects had no knowledge of my data gathering, nor did I request friendship with, follow, or converse with them in any manner before, during, or after the gathering of data. This was meant not only to avoid liability issues but also to protect the integrity of the data. Socioeconomic and sociocultural status were not taken into consideration. While this distinction would have allowed for more clarity and additional hypotheses, it was not feasible with time constraints. Number two: the age of each subject needed to be estimated in order to place them into a categorized age group. This was done based on the subject’s physical appearance and on the content of their posts. The ages are strictly estimations, though I’m confident in the accuracy of the estimations. At no time have the actual ages of the subjects been known. Number three: self-identification of sex and sexual orientation were not taken into consideration for categorizing deviations.

Once all subjects were identified and their posts collected, the total number of words written by each subject was documented in a table. I categorized four different types of deviations for this study: acronyms (e.g. BTW), capitalization (e.g. david), informal contractions (e.g. gonna), and non-standard spelling (e.g. todae, instead of today). The first three of these categories are straightforward; however, in the interest of simplicity, I found it necessary to group multiple different variations of deviation into the “non-standard spelling” category. These include lengthened words, shortened words, abbreviated words, and slang words. Lengthened words are any words that maintain proper spelling but add extra letters at some point in the word (e.g. reaaaaaallllyyy). Shortened words are those that follow the same spelling convention but are cut off before finishing the word completely (e.g. bro). Abbreviated words are those that remove certain letters to shorten the word (e.g. u instead of you; frgt instead of forgot). Instances of the four categories (acronyms, capitalization, informal contractions, and non-standard spelling) were counted and documented in the corresponding table for each subject.

For data calculation, I divided the number of deviations in each separate category by the total number of words written by the subject, so if there were four capitalization deviations out of 300 total written words, I would divide 4 / 300 for a 1.3% deviation in capitalization. I performed this calculation for each category. I then calculated a total deviation percentage by dividing the total number of deviations (acronyms, capitalization, informal contractions, and non-standard spelling) by the total word count of the subject. This process was completed for each subject and percentages were noted in a table. Once all individual subjects had been calculated, I calculated total group deviations for each category and the deviation percentage as a whole. This calculation was completed by counting the total number of deviations for each category in each of the six groups and dividing by the total number of words written by the designated group. For a hypothetical example, if the 13-17 year old female group wrote a total of 2500 words and had 100 capitalization deviations, I would divide 100 / 2500 for a total group deviation percentage of 4%. Once all calculations for each individual subject and each group were completed, I followed the same calculation to find the total deviation percentages of all the groups combined. I then used these percentages to make my findings and assumptions for this research report.
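
The calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool used for the study: the names `SubjectTally`, `percentage`, `subject_percentages`, and `group_percentages` are hypothetical, as are the group label format and the two made-up subjects at the end. The only figures taken from the text are the two worked examples (4 / 300 and 100 / 2500). The sketch assumes, as the method describes, that group percentages are found by pooling deviation counts and word counts across the group rather than by averaging each subject's percentage.

```python
from dataclasses import dataclass, field

# Category names follow the four deviation types defined in the Method section.
CATEGORIES = (
    "acronyms",
    "capitalization",
    "informal_contractions",
    "non_standard_spelling",
)


@dataclass
class SubjectTally:
    """Hypothetical record of one subject's counts across their five posts."""
    group: str        # e.g. "13-17 F"; this label format is an assumption
    total_words: int  # total words written by the subject
    deviations: dict = field(default_factory=dict)  # category -> count


def percentage(count: int, total_words: int) -> float:
    """Deviations divided by words written, expressed as a percentage."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * count / total_words


def subject_percentages(subject: SubjectTally) -> dict:
    """Per-category and total deviation percentages for one subject."""
    counts = {c: subject.deviations.get(c, 0) for c in CATEGORIES}
    result = {c: percentage(n, subject.total_words) for c, n in counts.items()}
    result["total"] = percentage(sum(counts.values()), subject.total_words)
    return result


def group_percentages(subjects: list) -> dict:
    """Pool counts and words over a group, then divide once per category."""
    words = sum(s.total_words for s in subjects)
    counts = {c: sum(s.deviations.get(c, 0) for s in subjects) for c in CATEGORIES}
    result = {c: percentage(n, words) for c, n in counts.items()}
    result["total"] = percentage(sum(counts.values()), words)
    return result


# The two worked examples from the text.
print(round(percentage(4, 300), 1))  # 1.3
print(percentage(100, 2500))         # 4.0

# Two made-up subjects whose pooled counts reproduce the 100 / 2500 example.
a = SubjectTally("13-17 F", 1000, {"capitalization": 30})
b = SubjectTally("13-17 F", 1500, {"capitalization": 70})
print(group_percentages([a, b])["capitalization"])  # 4.0
```

Pooling matters because subjects write different amounts: averaging the two made-up subjects' individual percentages (3.0% and about 4.7%) would give a different figure than the pooled 4.0%.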

Results and Analysis 
Table 1.1

Informal Contractions 
Informal contraction usage does decline with age, though the decline is minimal. The  consistency in informal contraction deviation suggests that there is no social pressure to  eliminate these deviations in informal writing. People often speak using informal contractions, so it doesn’t come as much of a surprise that they are also accepted in informal text. If informal  contractions were non-standard in speech, it could be assumed that the deviation percentage  would lower by age group as well. One of my hypotheses was that informal contractions would  be the most prominent deviation among the 26-40 age groups, though this was not the case, as  non-standard spelling was actually performed more frequently. 

Non-Standard Spelling 

The 13-17 male group has a very high deviation percentage in non-standard spelling (5.1%) in comparison to the other groups, as well as a higher percentage in the other three deviation categories. With the exception of the 13-17 year old males, non-standard spelling remains relatively constant throughout the age and sex groups. Probably the most surprising information in the data table is the lack of deviation in spelling by the 13-17 year old females, who align more closely with the two older male groups and the 26-40 female group.

Acronyms 

Acronym usage never rises above 1%, for any group, making it a relative non-factor in  terms of deviation.  

Capitalization 

Capitalization deviation is the most frequent deviation, which supports my first  hypothesis in this study. The 13-17 male group in particular has a very high percentage in  comparison to the other groups. The largest decline in deviation comes in the form of  capitalization. Females 13-17 deviate in capitalization 4.6% of the time, while males of the same  age deviate 6.6% of the time. All other groups are under 3%. This suggests that the need to  adhere to capitalization standards increases as people age. The motivations behind this adherence  likely stem from social pressures (e.g. present as better educated, look less immature, etc.). 

Interestingly, the drop-off in deviation between the two youngest male groups (13-17 and 18-25) is 4.5 percentage points, which is the largest drop-off between any two age groups. Males between the ages of 26 and 40 also have the lowest deviation percentage in capitalization of any group in the study, at 0.1%. Both the male and female 26-40 groups had a higher frequency of non-standard spelling than of capitalization deviation; they were the only groups to deviate more in a category other than capitalization.

Big Picture 

My second hypothesis for this research was that the frequency of deviation between the sexes would vary across age groups. This hypothesis proved to be true, and the most surprising instance of this was the increase in non-standard spelling between the 13-17 and 18-25 year old female groups, with the 13-17 group at 0.8% and the 18-25 group at 2.0%. This was the only instance in which the deviation percentage increased from one male or female group to the next older group of the same sex. Males aged 13-17 deviate more than any other group in every category, with a total deviation rate of 13.9%. This doubles the rate of the next closest group, and, interestingly enough, 26-40 year old males have the lowest total deviation rate at 1.4%. The data show that females have a steady decline in deviation but never reach the same heights as the males at the peak deviation age.

Table 1.2

When performing an initial read-through of all the posts analyzed in this research, I assumed the percentages of deviation would be much higher for each group, save the 26-40 male and female groups. The relatively low percentages in deviation were actually a surprise given all the conscious “errors” I had seen before data analysis. The biggest reason for this is likely that I’m not used to reading social media posts, which made the posts seem riddled with non-standard writing. Much of it stuck out like a sore thumb, so the initial reading made it appear worse than it really was (see Table 1.3 below for a visual of deviation v. non-deviation). A second note on data gathering concerns how frequently I skipped over common informal contractions (e.g. gonna, wanna, etc.). It was more difficult for me to notice these deviations because I personally use them in informal writing and texting. My own informal writing habits required me to take special care to find the more common informal contractions in the social media posts.

Table 1.3

Sex had very little to do with deviation percentages, except that the non-standard spelling deviation for females 13-17 was much lower (averaging out similarly to other groups) than that for the males 13-17. I can think of no explanation as to why this deviation percentage would be so low compared to that of the males 13-17.

Inferences

I would argue that non-standard spelling and informal contraction usage stem from the writer’s desire to place tone and phonetic presentation on the text. It’s a way of allowing readers to see the tone in the text. An example from a 13-17 year old male subject shows this when he says he’s “puttin’ on da pressha.” Notice also his consciousness of shortening the word putting by placing an apostrophe after the letter [n]. His placement of the apostrophe shows that his shortening of the word had nothing to do with typing fewer keystrokes, but that it was meant to express the way he would say it aloud. The word da is meant to represent the word the, as da is a popular replacement word for the in informal writing and speech, and it’s featured in a number of hip-hop songs. Pressha is meant to be the word pressure. The spelling of this word is interesting in that the first syllable follows proper spelling conventions, while the rest of the word is written differently so that the reader will deviate from regular pronunciation. It can be assumed that this sentence was written to evoke a specific phonetic deviation, a deviation that he likely would have expressed in speech had he been speaking aloud. I make this inference because of the way children of this age are often trying to find their own identity and/or present to their peers a certain identity. Deviation in both spoken and written forms expresses a rebellious attitude, and those in the 13-17 year age group typically view rebellion as a factor in being labeled as a cool kid and working their way into adolescence. Another supporting feature of this inference is the way people are attempting to express tone in their social media posts. Typically, written texts are more formal in nature and aren’t meant to express emotion or changes in tone. This changes entirely, though, with the informality of social media posts, as writers feel the need to express their tone through the text.
One reason for this is that the text is all they have to express their tone, with the exception of punctuation, which is beyond the scope of this research. In Gretchen McCulloch’s book, Because Internet: Understanding the New Rules of Language, she states, “We’re creating new rules for typographical tone of voice. Not the kind of rules that are imposed from on high, but the kind of rules that emerge from the collective practice of a couple billion social monkeys – rules that enliven our social interactions.” From this, we see that tone is meant to be expressed through the deviations performed on social media posts, which are typically in the form of non-standard spelling and informal contractions. If the male subject were to write, “putting on the pressure,” the message would not be read as intended because the readers can’t see any of the tone or phonetics in the writing. Given this information, it would be reasonable to assume that deviation percentages should be higher in all groups, as tone of voice is an important marker in speech, yet the percentage remains rather low except in the 13-17 male group.

Capitalization deviation follows a similar line to that of non-standard spelling for the 13-17 age groups. A lack of capitalization is viewed as rebellious in nature. While it could also reflect an unwillingness to put forth the effort to hit the shift key, there is no evidence to suggest that those writing on social media are typing for speed or ease; rather, they appear to be sending their own message with their own flair. The results of this research show that capitalization deviation decreases as subjects age, which also supports the belief that not capitalizing words stems from a search for identity and/or a way to follow the cool social norms of peers. If this were not the case, older age groups would, in theory, also maintain a similar capitalization deviation.

One explanation for the decline in total deviation as people age is the process of maturity and the desire to appear more professional and intelligent. As people age, they feel more pressure to adhere to standard writing forms. This pressure comes with the desire to appear more marketable in the professional world. This is especially important now that we’re in a time when employers may examine the social media posts and activity of potential employees. People wish not only to appear professional to potential employers but also to present as intelligent to their peers. At some point in the life of the subject (seemingly between the ages of 18 and 25), their mind switches from thinking they need to deviate from standard writing forms to thinking they need to follow standard writing forms. This points directly to the way maturity affects literacy presentation. This finding is interesting in that those performing non-standard deviations are conscious of their decision to deviate. Those reading the posts understand that the individual is using deviations, and the reader likely wouldn’t assume those deviations are performed because of a lack of intelligence. Qualitative data and knowledge of psychological theory would be required to confirm the inferences I’ve made from the results of this research.

Conclusion

Understanding informal writing on social media is a difficult undertaking. The primary aim of this research wasn’t to find out why people write informally, but which age and sex groups were writing informally with the most prevalence. It’s clear from the data that deviation percentages decrease with age and that total deviation is relatively low within each group. The only really surprising difference in deviation between males and females came with the 13-17 year old subjects. Females rarely performed non-standard spelling, while males of the same age group performed non-standard spelling over 5% of the time. This rise in non-standard spelling by 13-17 year old males also raised their total deviation percentage to 13.9, which was double that of the 13-17 year old females. Capitalization had the highest deviation percentage with all groups combined, coming in at 2.48% of total deviation. Non-standard spelling was second with 1.46% (refer to Table 1.3 for a visual). The inferences made in the Inferences section propose that much of the reasoning behind informal writing comes from how one thinks one will be perceived through the text. The phonetic presentation and tone of words aim at putting the writer’s voice in the mind of the reader, which primarily connects to informal contractions and non-standard spellings. The writer in these instances wants the reader to get a certain impression or feeling from the text, so they alter the presentation of the text to form that impression. One set of data I could have gathered to help with qualitative analysis would have been a count of all the deviations that represented a change in tone or phonetic presentation. Not all spelling deviations result in a change of tone, so this percentage would have been different, and could have offered an additional avenue of analysis. Capitalization deviation could come from rebellion, laziness, or individuality.
Individuality seems unlikely given the frequency of use by subjects, and laziness, too, seems unlikely, since nothing has been shown to suggest people are trying to get messages out more quickly. Perhaps it’s as simple as people of that age wanting to feel like they’re going against the rules, which would support Finders’ thoughts on the process of adolescence. Neither of these inferences can be confirmed without more quantitative and qualitative data, which I will offer as a continuing study to the research in this report.
 
Works Cited 


Finders, Margaret J. Just Girls: Hidden Literacies and Life in Junior High. New York: Teachers College Press, 1997. Print.

McCulloch, Gretchen. Because Internet: Understanding the New Rules of Language. New York: Riverhead Books, 2019. Print.

                                                                  
TOMMY BROWN is a senior pursuing a Baccalaureate in  English with a minor in History. Selected by Professor David Bowie.
