Observable Patterns in Conversations about Ilhan Omar

A. Summary

This data analysis looks at conversations on Twitter about Ilhan Omar that occurred in June, July, and August of 2019. Specifically, it examines 4 spikes in conversation on Twitter, looking at the accounts participating in each spike and at the top domains and YouTube videos shared by participants. Collectively, these 4 spikes in conversation make up just under 1.2 million tweets.

Data used for this analysis was collected from a Twitter search for Congressperson Ilhan Omar's name: "Ilhan Omar." The search did not use any other hashtags or terms.

The analysis shows several trends:

  • Shares to YouTube dwarf shares to other domains. In each spike, YouTube was the most popular external domain shared - collectively, in all four spikes, at least 10,756 YouTube links were shared.
  • Aside from YouTube, the right wing site Gateway Pundit is the most popular domain shared. In three out of the four spikes, Gateway Pundit was the most popular site shared. In the fourth spike, Gateway Pundit was the second most popular site shared.
  • 260 accounts were highly active in all 4 spikes (in the 95th percentile or greater as measured by post count). 246 accounts (94.6%) are right leaning to far right, compared to 10 accounts (3.9%) that were mainstream to left wing.
  • The most popular YouTube shares in each spike trended hard right. Out of the top 16 YouTube videos shared, only one was from a left leaning source (Now This News); the remaining fifteen were from right wing sources, including some sources known for sharing extremist content and misinformation. Additionally, YouTube's recommended videos reinforced right leaning to far right perspectives, so once a person landed on YouTube the video recommendations would keep them firmly rooted in a right wing perspective, or an extremist/white supremacist perspective.

The four spikes in conversation that occurred in the summer of 2019 show multiple ways in which the right and the far right dominated the conversation about Ilhan Omar on Twitter, and how that imbalance extended onto YouTube.

This analysis does not look at corresponding activity on Facebook, and this analysis does not look extensively at whether or not any accounts are engaging in coordinated misinformation efforts.

B. Introduction

In this analysis, we will look at 4 spikes in conversation that have occurred about Ilhan Omar in June, July, and August. This analysis uses Twitter as a starting point, and also examines YouTube shares. Each individual spike is described in more detail below.

This analysis focuses on three things for each spike:

  • levels of participation among the most active accounts;
  • domains shared within the data set; and
  • top YouTube videos shared.

These general, and distinct, indicators help provide an initial sense of the source material used to inform the conversation.

At the end of the analysis of the four spikes, I also examine the apparent ideological leanings of the accounts that were highly active (in the 95th percentile or greater) across all four spikes.

In this analysis, I will generally not be identifying individual accounts for two main reasons:

  1. precise attribution is difficult; while some accounts within this dataset clearly appear to be inauthentic, I prefer to err on the side of caution. If/when an authentic account is incorrectly labelled as inauthentic, it can direct destructive attention toward that account. The short version: I'm personally not okay with doxing.
  2. issues related to misinformation go beyond individual accounts. Patterns are interesting, and individual accounts are rarely of interest in their own right, but they are of greater interest when they can be situated within a pattern.

In rare cases, when an individual account helps illustrate a larger point, I reserve the right to use an individual post. I will generally do this only when the account in question is verified, and/or belongs to a public figure, and/or has been active in spreading misinformation. In most cases, when I use an individual post as an example, it will be stripped of as many non-relevant details as possible.

C. Questions Asked and General Notes

This section provides context and some general notes on methodology used in the analysis. If you want to read this later and skip straight to the analysis of the spikes, head right this way!

C1. Who/what is creating the buzz?

To help get a rough sense of how participation in this conversation unfolds, I calculate what percentage of accounts participating in the conversation create 10% of all posts in the spike. This number is a rough proxy for how top-heavy a conversation might be: in a balanced conversation, 10% of participants would create 10% of the conversation.
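A minimal sketch of how this number could be computed, assuming a mapping of account names to post counts is already available. The function name `top_heavy_share` and the toy data are illustrative assumptions, not the tooling used for this analysis.

```python
from collections import Counter


def top_heavy_share(post_counts, content_share=0.10):
    """Return (n_accounts, fraction_of_accounts) for the smallest group of
    most-active accounts whose posts reach `content_share` of all posts."""
    total_posts = sum(post_counts.values())
    if total_posts == 0:
        return 0, 0.0
    running = 0
    n_accounts = 0
    # Walk accounts from most to least active until the share is reached.
    for _, count in Counter(post_counts).most_common():
        running += count
        n_accounts += 1
        if running / total_posts >= content_share:
            break
    return n_accounts, n_accounts / len(post_counts)


# Hypothetical toy data: one loud account and 99 quiet ones.
toy = {"loud": 20, **{f"quiet_{i}": 1 for i in range(99)}}
n_accounts, fraction = top_heavy_share(toy)
```

In the toy data, a single account out of 100 is enough to reach 10% of all posts, which is the kind of top-heavy pattern this proxy is meant to surface.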

It cannot be emphasized enough that these numbers are a very rough proxy for how engaged the most engaged participants are, and these numbers are best understood as indicators of other things to look for, rather than as meaningful in their own right. On Twitter - as in life - conversations can be dominated by very loud or active participants. The gap between the percent of participants and percent of the overall conversation can be an interesting indicator. When the percentage of participants edges closer to 10%, it can suggest more balanced participation across accounts. When the percentage of participants is smaller, it can indicate a more frenzied conversation, higher participation by spambots (on or off topic), or other forms of artificial manipulation.

However, to re-emphasize this point: these numbers should only be understood as potential indicators. Additionally, the search terms and filters used to generate a data set can affect what these numbers look like, which makes it difficult to use these numbers to make apples to apples comparisons across data sets generated from different search terms. I am including these numbers here because they provide some context, but they should be considered rough indicators, at best.

C2. What domains are shared?

The domains used as sources within a conversation can provide a rough indication of the perspectives and ideological leanings of participants. Collecting the list of domains is the easy part. Coding those domains on a scale that measures (or approximates a measure of) ideological leaning is more difficult, and generally satisfies no one. However, it's a necessary element of the work, and I am attempting to be as clear and transparent as possible about how domains are coded.

For this analysis, I created two general groups using the spectrum of political right to political left. At the outset, I want to be clear that this definition is an oversimplification. However, for the purposes of this analysis, the oversimplification embedded in this coding is both a strength and a weakness - while there are going to be fringe cases that don't fit cleanly within this coding, the general structure is simple to the point where it is easy to use and easy to understand.

The two general categories are:

  • mainstream to left leaning to far left;
  • and right leaning to far right.

In determining where a publication stood on the spectrum from far right to right leaning to mainstream to left leaning to far left, publications like USA Today, the AP, and Reuters are considered mainstream. Sources like CNN, the NY Times, and the Washington Post, which are generally mainstream but, in aggregate, lean left, are included in the "mainstream to left leaning to far left" group. Sites like "Mediaite" and "Raw Story" have an editorial direction that is strongly to the left; these sites also share stories and headlines designed to be clickbait, and/or to misrepresent the facts of an issue to fit a political or ideological narrative. Publications like the New York Post and Wall Street Journal, which consistently swing right, are included as mainstream sources and are coded within the "mainstream to left leaning to far left" group.

In general, for a source to be considered right leaning or far right, it needed to be to the right of the Wall Street Journal or the New York Post. Fox News (discussed in more detail below) is coded within the "right leaning to far right" group, whereas Fox affiliates -- which often have more balance and a degree of editorial independence -- were coded within the mainstream group. Sites associated with known racists or far right activists were coded as right leaning to far right.

Advocacy sites were coded within the political affiliation that most closely aligned with their advocacy. I also used Media Bias Fact Check to check my coding. This writeup also contains the list of the top 50 domains shared in each spike, and that list includes my coding so it can be checked for accuracy and argued over indefinitely.

The decision about where to code Fox News was surprisingly difficult. My initial inclination -- largely because of the presence of voices like Sean Hannity, Tucker Carlson, Lou Dobbs, Laura Ingraham, Jeanine Pirro, etc. -- was to include Fox within right leaning to far right. However, there are a small number of journalists in their news unit (looking at you, Shep Smith) who, while definitely leaning right, have committed acts of actual journalism.

However, this question was simplified by how Fox shares its content on YouTube. On YouTube, Fox shares its opinion hosts -- many of whom share biased, racist, misogynistic content, and/or blatant conspiracy theories -- under the "Fox News" name.

Fox News opinion hosts

This clear connection of the news side and opinion side on their YouTube presence -- which has over 3 million subscribers, and millions of views on its videos -- simplifies the decision, and was the deciding factor in grouping Fox News into the "right leaning to far right" group.

The coding in this analysis should be understood as a rough grouping of political leaning, and this coding stops short of determining whether or not a site shares false, misleading, or inaccurate stories. In some cases, if a site has a clear track record of spreading misinformation, that is noted in the analysis.

C3. What does it mean when google.com shows up in a domain list?

The domain "google.com" shows up in the listing of top domains; this is generally an indication of a hamfisted and amateurish setup of Google's "accelerated mobile pages" - more information available here.

The National Review provides a great example of this incompetence in action: https://www.google.com/amp/s/www.nationalreview.com/news/link-to-misinformation/amp. In this example (with the full url changed so as not to provide more visibility to any stories), you can see that google (dot) com shows up as the primary domain. This is a common trait among both less reputable sites, and reputable sites with sub-par technical implementation: because the main domain shows up as "google (dot) com" the site will generally show up as "trustworthy" regardless of whether or not the site is reliable. This is one of several ways that AMP is not good.
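For anyone who wants to attribute these shares to the underlying site instead, the following is a minimal sketch that unwraps AMP viewer links before counting domains. It assumes links shaped like the example above (google.com/amp/, an optional "s/" segment, then the real host); the function name `shared_domain` is hypothetical, and this is not necessarily how the domain lists in this analysis were built.

```python
from collections import Counter
from urllib.parse import urlparse


def shared_domain(url):
    """Return the domain a shared link points to, unwrapping Google AMP links."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path
    # Assumed AMP viewer shape: google.com/amp/[s/]<real host>/<path>
    if host in ("google.com", "www.google.com") and path.startswith("/amp/"):
        rest = path[len("/amp/"):]
        if rest.startswith("s/"):
            rest = rest[len("s/"):]
        host = rest.split("/", 1)[0].lower()
    # Treat "www.example.com" and "example.com" as the same domain.
    return host[4:] if host.startswith("www.") else host


links = [
    "https://www.google.com/amp/s/www.nationalreview.com/news/link-to-misinformation/amp",
    "https://www.nationalreview.com/news/link-to-misinformation/",
]
domain_counts = Counter(shared_domain(link) for link in links)
```

With this approach both links are counted under the same site, rather than one of them being counted as a share of google.com.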

C4. What does it mean when twitter.com shows up in the domain list?

Links to twitter.com indicate that people are sharing and amplifying individual tweets, which can be indicative of echo chambers and/or highlighting accounts to swarm. Additional analysis of accounts sharing links to other Twitter URLs is required to gauge whether or not there is any level of artificial or coordinated signal boosting among these accounts.

C5. Analysis of YouTube Shares

For each spike, I examine the top 4 YouTube videos shared. This analysis looks at:

  • the number of shares on Twitter to the video
  • the source of the video
  • number of plays for the video.

For the first and fourth most popular videos in each spike, the analysis includes a breakdown of the recommended videos in the sidebar, up to a maximum of 9.
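As a rough sketch of how the most shared videos could be identified, the snippet below groups shared links by video ID and keeps the top 4. It only handles the two most common link shapes (youtube.com/watch?v=... and youtu.be/...), and the function names and placeholder IDs are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def youtube_video_id(url):
    """Return the video ID for common YouTube link shapes, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/", 1)[0] or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
    return None


def top_videos(urls, n=4):
    """Count shares per video ID and return the n most shared."""
    ids = (youtube_video_id(u) for u in urls)
    return Counter(i for i in ids if i).most_common(n)


# Placeholder IDs, not real videos.
example_urls = [
    "https://www.youtube.com/watch?v=VIDEO_A",
    "https://youtu.be/VIDEO_A?t=10",
    "https://www.youtube.com/watch?v=VIDEO_B&feature=share",
    "https://example.com/not-a-video",
]
most_shared = top_videos(example_urls)
```

Grouping by video ID rather than by raw URL keeps short links and links with extra query parameters from being counted as separate videos.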

D. Spike One: June 22nd through June 29th

This initial spike coincided with congressional delegations visiting border camps holding people seeking asylum in the US. Multiple congresspeople compared the dehumanizing conditions of these camps to concentration camps, and the story was gaining increased visibility via press coverage. Additionally, on June 20th, the NY Times put out a story profiling some of the people having racist reactions to Somali refugees resettling in Minnesota.

The subject of this spike was misinformation about Ilhan Omar's past.

Spike 1 posts

Between June 22nd and June 29th, 175,260 tweets came from 80,365 accounts.

The top 544 most active accounts - .68% of all active accounts in this spike - created 10% of all content in this spike.

Across all accounts, approximately 1500 unique domains were shared a total of 21,314 times.

A scan through the top 20 domains shared over this time period shows a strong skew toward right wing sites, with multiple sites of conspiracy theorists and far right figures appearing above mainstream sites. In the top 20 sites, 3828 shares point to 13 different right leaning or far right domains. 899 shares point to mainstream or left leaning content -- and all of those shares are from one source, the Star Tribune, the local paper in Ilhan Omar's district. Six of the domains in the top 20 were either link sharing services or links to other social media sites like Facebook or YouTube.

Out of the top 50 sites, 28 domains were right leaning or far right; these 28 domains were shared 4732 times. 8 domains were mainstream to left leaning to far left, and these domains were shared 1297 times. When we look at individual examples, fringe conspiracy sites and outright hate sites were shared in greater numbers than mainstream news sites. For example, links to Pam Geller's site were shared 184 times; Laura Loomer's site was shared 147 times; Infowars was shared 75 times; and the New York Times was shared 56 times.

Links to YouTube videos dwarf shares to other domains, with 1776 shares to YouTube. The next most popular domain after YouTube is Gateway Pundit, with 1328 domain shares. The Star Tribune - a local paper in Minnesota which is considered both mainstream and reliable - was shared 899 times.
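The per-group totals reported for each spike come from summing share counts within each coded group. A minimal sketch of that tally follows, using only the individual spike one figures quoted above. The group assigned to Pam Geller's site, Laura Loomer's site, and Infowars is assumed from the coding rules in section C2 rather than quoted from the top 50 list, and `tally_by_group` is a hypothetical helper.

```python
from collections import defaultdict

RIGHT = "right leaning to far right"
MAIN_LEFT = "mainstream to left leaning to far left"

# Share counts quoted in the text for spike one; groups per the C2 coding rules.
spike_one_shares = {
    "Gateway Pundit": (1328, RIGHT),
    "Star Tribune": (899, MAIN_LEFT),
    "Pam Geller's site": (184, RIGHT),
    "Laura Loomer's site": (147, RIGHT),
    "Infowars": (75, RIGHT),
    "New York Times": (56, MAIN_LEFT),
}


def tally_by_group(shares):
    """Return {group: (number of domains, total shares)}."""
    domains = defaultdict(int)
    totals = defaultdict(int)
    for count, group in shares.values():
        domains[group] += 1
        totals[group] += count
    return {group: (domains[group], totals[group]) for group in totals}


tally = tally_by_group(spike_one_shares)
```

For these six sites alone, the tally comes to 1734 shares across 4 right leaning to far right domains and 955 shares across 2 mainstream to left leaning domains. The full top 50 totals given above cover many more domains than this subset.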

The full list of the top 50 domains is included below.

The top 4 YouTube shares are listed below. The top 3 videos - shared collectively 406 times during this spike - all point to right leaning to far right content. The 4th most shared video - shared 91 times during this spike - is from Now This News, a progressive organization.

A look at the YouTube pages for these videos, however, suggests that the sharing of the link from Twitter is just the beginning. The screenshot shared below of the Rebel Media video was taken on August 20th from a clean browser while not logged in to YouTube. The 9 recommended videos at the top of the list include:

Spike 1 - video 1

  • 4 links to Fox News
  • 1 link to CNN
  • 1 link to Piers Morgan
  • 1 link to Channel 4 News
  • 1 link to Star Parker
  • 1 link to Vice "debate"

If a person comes to this video, the main options presented to them slant heavily to right leaning to far right perspectives.

Looking at the only progressive video in the top 4 most shared -- which was the 4th most popular video shared -- the breakdown of the 9 top recommended videos on the "Now This News" video include:

Spike 1 - Video 4

  • 3 links to Fox News
  • 2 links to MSNBC
  • 1 link to a Bill Maher interview with Ben Shapiro
  • 1 link to CNN
  • 1 link to C-SPAN
  • 1 link to The Daily Show

The YouTube recommendations on these videos include a small number of mainstream to left leaning sources, but the majority of recommendations are to right wing sources.

E. Spike Two: July 9th to July 13th

This spike appears to be sparked by a Tucker Carlson segment where Carlson continued his pattern of using racist smears as a core element of his program.

Spike 2 posts

In this spike, 90,788 accounts posted 209,404 times over 5 days (July 9-13).

The top 703 most active accounts - .77% of all active accounts in this spike - created 10% of all content in this spike.

In this time period, approximately 1450 domains were shared 20,702 times.

Out of the top 20 domains shared, 3238 shares pointed to 11 different right leaning or far right domains. 790 shares pointed to 4 different mainstream or left leaning domains.

Out of the top 50 domains shared, 3803 shares pointed to 19 different right leaning to far right domains. 1604 shares pointed to 16 mainstream to left leaning to far left domains.

As with the first spike, links to Twitter and YouTube dominated shares, with 7079 and 1391 shares, respectively. Fox News, The Gateway Pundit, and Breitbart were the next most popular domains, collectively shared 2062 times. In comparison, the most popular mainstream to left leaning domains (Huffington Post, Mediaite, and Microsoft News) were shared a total of 625 times.

The Western Journal - a far right site run by a political activist who was responsible for the Willie Horton ad and who currently runs a PAC with Herman Cain - was shared 198 times. In comparison, the Washington Post was shared 165 times.

The most popular YouTube videos slant heavily toward right leaning and far right sources as well. The top four YouTube shares all point to videos that represent right wing perspectives.

The most shared video - from the Next News Network - has recommended videos that are almost exclusively right wing. The recommended videos include:

Spike 2 - Most shared on YouTube

  • 6 from Fox News
  • 1 from NBC News
  • 1 from "Valuetainment"
  • 1 from "Pure living for life"

The fourth most popular video - from an account named "Contemptor" - follows the same pattern. Recommended videos include:

Spike 2 - 4th on YouTube

  • 5 to Fox News
  • 1 to a Bill Maher interview with Ben Shapiro
  • 1 to a video of Ann Coulter calling feminists "angry man-hating lesbians"
  • 1 to CNN
  • 1 to CBS News

The second spike has very similar patterns to the first spike: the share of domains is heavily slanted to right leaning and far right content. Top YouTube shares are nearly exclusively to right leaning or far right content, and the recommended videos from the top shares are heavily weighted to right leaning or far right sources.

F. Spike Three: July 13th to July 18th

The third spike picks up where the second spike ends, and covers the time period in which Trump told Ilhan Omar and three other congressional representatives to go back to "the totally broken and crime infested places from which they came."

Trump comments

While both the second and the third spike include parts of the 13th, the second spike ends at 03:00 on the 13th, and the 3rd spike picks up at 04:00.

Spike 3 posts

In the time period between July 13th and July 18th, 232,293 accounts posted 622,855 tweets over 6 days.

The top 1605 most active accounts - .69% of all active accounts in this spike - created 10% of all content in this spike.

In this time period, approximately 3450 domains were shared 81,815 times.

Out of the top 20 domains shared, 8325 shares link to 7 right leaning or far right sources. 5401 shares link to 6 different mainstream or left leaning or far left domains.

Out of the top 50, 12,294 posts linked to 21 right leaning or far right domains. 7995 shares linked to 16 mainstream to left leaning to far left domains. In this third spike, shares to right wing sources still dominate shares to left wing or mainstream sources. The Gateway Pundit - a far right site that regularly spreads misinformation - was shared 4780 times; this is more than the combined total of the top 4 most shared mainstream to left leaning to far left sites (Huffington Post, the Star Tribune, Wall Street Journal, and The Guardian), which were shared a total of 4400 times.

Shares of YouTube videos continue the right leaning to far right domination seen in the first two spikes.

In the third spike, 5882 total posts share links to YouTube, and the top 4 videos shared all represent right wing viewpoints.

The recommendations from The Blaze link almost exclusively to right leaning or far right sources:

Spike 3 - The Blaze

  • 6 from Fox News
  • 1 from Vice
  • 1 from Black Pill
  • 1 from Glenn Beck

The ads and recommendations from The Next News Network video point primarily to right wing content. For this video, an ad cut one video out from the top screen, so we only have eight video recommendations.

Spike 3 Next News Network

  • 6 from Fox News
  • 1 from "enduringcharm"
  • 1 from Vice

G. Spike Four: August 15th to August 17th

This spike was triggered by Israel refusing entry to Ilhan Omar and Rashida Tlaib, and President Trump's two tweets supporting a foreign nation over two elected congresspeople.

Spike 4 posts

In the time period between August 15th and August 17th, 93,844 accounts posted 188,656 tweets over 3 days.

The top 826 most active accounts - .88% of all active accounts in this spike - created 10% of all content in this spike.

In the fourth spike, approximately 2050 domains were shared 32,780 times.

Out of the top 20 domains shared, 3514 shares link to 6 right leaning or far right sources. 2370 shares link to 5 different mainstream or left leaning or far left domains.

Out of the top 50 domains shared, 4864 posts linked to 17 right leaning or far right domains. 4201 shares linked to 19 mainstream to left leaning to far left domains. In this fourth spike, total shares to right wing sources still exceed total shares to left wing or mainstream sources, even though the top 50 includes 2 more mainstream to left leaning domains than right leaning domains.

The fourth spike follows the patterns of the first three spikes, with links to right wing domains publishing dubious or outright racist and/or extreme content being shared at a higher volume than links to mainstream or left leaning or far left content. The Gateway Pundit was shared 1077 times, more than twice the total of shares to the NY Times, the most shared mainstream to left leaning site, which was shared 517 times. The Western Journal was shared 190 times, and links to Laura Loomer's site were shared 154 times; links to the Washington Post were shared 151 times.

In the fourth spike, 1707 posts shared links to YouTube videos. As with the other spikes, the most popular videos all featured right wing content, including content from sources known to push misinformation.

In looking at the videos and links shared on the first screen with the top shared video from Black Pill, we have 6 options - two ads, and four recommended videos. The two ads are to Epoch Times and Judicial Watch. Epoch Times has recently been engaged in highly suspect and misleading behavior on Facebook, and Judicial Watch is a far right source of conspiracy theories.

Spike 4 - YouTube video 1

The other video recommendations include:

  • 2 for Fox News
  • 1 for Black Pill
  • 1 for PragerU

The 4th most shared video - also to Black Pill - includes 8 links on the top screen; 7 videos and one ad.

Spike 4 - YouTube video number 4

The ad is for the National Republican Congressional Committee.

The video recommendations include:

  • 4 for Fox News
  • 1 for Fox Business
  • 1 for the Daily Signal
  • 1 for Huckabee

As with the other spikes, the top shared videos are right leaning to far right, and the recommended videos from YouTube are nearly all right leaning to far right.

H. Who Shows Up?

As noted in the summary of each spike, a small percentage of accounts creates an outsize percentage of the content. This isn't necessarily abnormal, but over time, noting what accounts show up most frequently can also help illustrate patterns. For each spike, I collected the accounts that were in the 95th percentile or higher as measured by post count. Then, I looked at what accounts were in the 95th percentile of activity across all four spikes.
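A minimal sketch of this two-step selection, assuming each spike is available as a mapping of account names to post counts. The nearest-rank definition of the 95th percentile and the helper names are assumptions made for illustration; the exact percentile method used for this analysis is not specified here.

```python
def highly_active(post_counts, percentile=95):
    """Accounts at or above the given percentile of post count (nearest rank)."""
    if not post_counts:
        return set()
    ranked = sorted(post_counts.values())
    # Nearest-rank percentile: 1-indexed position ceil(p/100 * N), in integer math.
    rank = max(1, (percentile * len(ranked) + 99) // 100)
    threshold = ranked[rank - 1]
    return {acct for acct, n in post_counts.items() if n >= threshold}


def active_in_all(spikes, percentile=95):
    """Accounts that clear the percentile cutoff in every spike."""
    sets = [highly_active(counts, percentile) for counts in spikes]
    return set.intersection(*sets) if sets else set()


# Hypothetical toy data: 20 accounts, two spikes.
spike_a = {f"a{i}": i for i in range(1, 21)}
spike_b = dict(spike_a, a19=1)  # a19 is far less active in the second spike
repeat_accounts = active_in_all([spike_a, spike_b])
```

In the toy data only `a20` clears the cutoff in both spikes. Because many accounts can share the same post count, a cutoff defined this way can admit somewhat more than 5% of accounts when there are ties at the threshold.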

260 accounts total were highly active (in the 95th percentile or greater) across all four spikes covered in this analysis. Out of these 260 accounts:

  • 246 accounts (94.6%) are right leaning to far right.
  • 10 accounts (3.9%) are mainstream to left leaning to far left.
  • 4 accounts (1.5%) were not clearly affiliated. These accounts were on a spectrum between overt gibberish and failed attempts at parody/joke accounts.

Coding of account leanings examined general traits of the accounts, including bios, recent posting histories, hashtags used, domains shared, and posts liked or retweeted. The following tweets provide samples from accounts that were coded as right leaning or far right:

Right wing Twitter example 2

Right wing Twitter example 1

Rightwing Twitter example 3

Two examples of left leaning accounts that were active across all four spikes are Ilhan Omar and news outlet The Hill.

Among the most active repeat participants, right and far right accounts vastly outnumbered mainstream and left leaning accounts. This analysis does not make any effort to determine whether or not these accounts are connected to real people, or whether or not these accounts are part of inorganic or inauthentic amplification as part of a larger network. While many of these accounts do show signs of being trolls and/or sockpuppets, more detailed analysis is required to determine potential authenticity or inauthenticity of individual accounts.

The overwhelming number of active participants from the right, relative to the much smaller number of participants from the mainstream and the left, indicates that on Twitter, right wing accounts show up more consistently. The fact that just under 95% of active repeat participants in these spikes are right leaning to far right, with just under 4% being left leaning or far left, helps highlight that in the conversations about Ilhan Omar, the right wing and far right voices are significantly more consistent and active than left leaning voices. This imbalance calls out for additional research on these accounts to determine how many can be connected to actual people, and how many are potentially sockpuppets working within a network.

I. Conclusion

When looking at what domains get shared, and at the most popular shares of YouTube videos, two facts become clear about the recent conversations about Ilhan Omar:

  • Content from right leaning to far right domains is shared at a much higher volume than mainstream, left leaning or far-left domains.
  • On YouTube, right leaning to far right content is initially amplified by disproportionate sharing from Twitter, and visitors to YouTube are subsequently served more right leaning to far right content via YouTube's content recommendation algorithm.

Individually, either of these elements indicates that right wing perspectives are overwhelming the conversation about an elected official. Taken together, however, these two factors are mutually supportive - this is how a closed system on seemingly "open" platforms takes shape.

When the imbalance in domain shares, the imbalance in shares to YouTube videos, and the rabbit hole effect of YouTube's content recommendation algorithm are combined, we get a clearer sense of how social media platforms can potentially be gamed in parallel to fabricate consensus, and to support the spread of increasingly radical and hateful content. The imbalance in one conversation (or spike) shifts the bounds of what's "normal," and then the next conversation shifts the norms even further.

This imbalance is further multiplied by the repeated rates of participation from right wing accounts. These accounts draw on older content generated from past flareups, creating a system that supports bias or misinformation in depth. Multiple content distribution strategies are at play here, and the whole is absolutely greater than the sum of its parts.

Over time, the right leaning to far right content creates an ever-growing foundation of sources that it can use to buttress arguments in future conversations. This ever-growing body of content provides a repository that reinforces a world view and a perspective. Conversations about specific issues become less about the individual issue, and more about proselytizing a world view and bringing people into the fold. To make a vast oversimplification, one of the possibilities suggested by this data set is that the left argues about specific points, while the right uses specific points to proselytize a world view.

While this analysis stays away from whether or not any of the activity is coordinated or inauthentic, this analysis highlights that conservative complaints of "censorship" on social media are somewhere between flimsy and baseless. The data set used in this analysis was derived from a search on a person's name. Theoretically, the results should have been pretty balanced. If YouTube and Twitter are attempting to be biased against conservatives, they are very bad at it. Similarly, if they are attempting to check or curb the use of their platforms as a means of spreading misinformation and extreme speech, they're not doing great there either.

I have yet to see any platform provide concrete data around the numbers of FTEs (and I'm talking full, salaried employees, not contractors) with dedicated time and clearly defined authority to shut down hate speech and misinformation. I have also never seen comparisons of staffing levels between, for example, advertising, or sales, or marketing, and teams fighting misinformation and abuse. If and when platforms ever become transparent and show us this information, we could begin to get a more concrete sense of how they prioritize the health of their platform relative to other business interests.

On August 27th, YouTube released new guidelines and renewed promises to "RAISE UP authoritative voices" and "REDUCE the spread of content that brushes right up against our policy line." However, given what is readily apparent on their platforms, the visible results of the current efforts of Twitter and YouTube - as observed in this analysis - do not appear remotely effective.

J. Top 50 Domains Shared

Spike 1

Spike 2

Spike 3

Spike 4