Long Form Writing: Big Data Surveillance

Imagine your every move, online and offline, being watched and recorded at all times, leaving a digital trace. Every click, like, favorite, and retweet we make online, and everything we do offline, like a credit card purchase, contributes to a digital trace or digital shadow that is collected into a large data set. This is what people refer to as “Big Data”: digging up data from multiple sources, analyzing the data collected, and sometimes making decisions based on that data. Big data has been around for quite some time now; it originally started when businesses collected data on their clients to see what makes them agitated. But eventually it was discovered that big data systems can be used in law enforcement investigations in many different ways.


Michal Kosinski, a student attending Cambridge University for his PhD in Psychometrics, met and teamed up with fellow student David Stillwell, who had just created and launched his own Facebook application called the MyPersonality App. In this app, users filled out a psychological questionnaire and, based on their answers, were given a “personality profile”. Kosinski took the participants’ answers and compared them to the online information they left behind, such as the posts they liked or shared, or the information they specified on Facebook: where they live, gender, age, and marital status. While any one piece of this information may not be reliable on its own, when thousands of posts or “individual data sources” are combined and recorded in a system, it becomes easier to make fairly accurate predictions.
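To make the idea of combining many weak “individual data sources” more concrete, here is a minimal sketch in Python. It is not Kosinski’s actual model. It assumes a simple naive Bayes update, treats every like as an independent signal, and uses made-up probabilities (`P_LIKE_GIVEN_TRAIT`, `P_LIKE_GIVEN_NO_TRAIT`) and a hypothetical helper named `probability_of_trait`.

```python
import math

# Hypothetical numbers for illustration only: each "like" is a weak signal
# that is slightly more common among people who have some trait.
P_LIKE_GIVEN_TRAIT = 0.55     # assumed chance a person with the trait likes a page
P_LIKE_GIVEN_NO_TRAIT = 0.45  # assumed chance a person without the trait likes it


def probability_of_trait(num_likes: int, prior: float = 0.5) -> float:
    """Combine num_likes independent weak signals with a naive Bayes update."""
    # Start from the prior belief, expressed as log-odds.
    log_odds = math.log(prior / (1 - prior))
    # Each matching like adds the same small amount of evidence.
    log_odds += num_likes * math.log(P_LIKE_GIVEN_TRAIT / P_LIKE_GIVEN_NO_TRAIT)
    # Convert the log-odds back into a probability.
    return 1 / (1 + math.exp(-log_odds))


if __name__ == "__main__":
    for n in (1, 10, 68):
        print(n, "likes ->", round(probability_of_trait(n), 3))
```

With these assumed numbers, a single like barely moves the estimate away from a coin flip, while dozens of likes pointing the same way push it close to certainty. Real likes are not independent of each other, so a real system would be less confident than this toy version.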


After years of revision, Kosinski and his team put the finishing touches on their app in 2012, and were able to make more specific predictions than ever. They announced that, based on just 68 Facebook “likes”, they could predict a person’s skin color, sexual orientation, religious affiliation, and political affiliation. All of this predicted information is, surprisingly, anywhere from 80-100% accurate. Interestingly enough, the personal data that has been collected can also be used the other way around, to search for other personal profiles.

Social media as a whole has changed how law enforcement both solves and prevents crime. “There’s a growing number of police departments who are making social media part of intelligence processing,” says Mike King, a former police officer who works in the law enforcement division of a Geographic Information Systems (GIS) developer. King goes on to say that there is “a weird psychology involved with people who are deeply embedded in social media.” Relying mainly on social media to crack a case can be difficult because it depends on so many different factors, one of them being whether someone’s location is turned on. On the other hand, in some cases where social media was heavily used, investigators found perpetrators actually broadcasting and announcing the crimes they had just committed.

The idea of using big data in law enforcement cases has been around for two decades now, and it continues to get more powerful every day. Today, most law enforcement agencies have adopted this method and are using whatever data is available to prevent crime and, in some cases, even catch criminals and put them behind bars. In fact, a number of cities in the United States are beginning to use what they refer to as “predictive data”, aka big data, on many of their cases to further investigations, drawing on a number of different platforms such as the cloud, all different kinds of social media, emails, and even our personal cellphones.

In Durham, North Carolina, where the crime rate is pretty high with about 13,357 (recorded) violent crimes per year, the police department was able to cut the crime rate in half by using “predictive data” to analyze relationships. The relationships analyzed are those between places, people, and other personal data, which ultimately helps the police department run more efficiently.

In Los Angeles, the LAPD is using big data systems to collect data from old cases, such as rap sheets, case management tools, DMV records, license plates, and so forth. “For years we’ve had stovepipe systems that have a lot of information but don’t talk to each other and don’t compare that information,” says LAPD Chief Charlie Beck. Since the LAPD implemented big data in its investigations, it has also seen a drop in its crime rate.

The Chicago Police Department is also a big supporter of using predictive data to solve cases. In fact, it has implemented a state-of-the-art big data system. The system has what the department calls a “heat-list”, which is basically a list of about 400 people who are most likely to be involved in crime. When someone is added to the CPD’s heat-list, they are notified immediately, along with the legal consequences that could follow from crimes they may or may not commit, as part of the police department’s Violence Reduction Strategy (VRS), which was established in July 2013. The CPD big data system also uses graphic and predictive algorithms. The reason is that the majority of the crimes being committed, such as gang violence, drugs, and gun violence, all travel through or occur in their neighborhoods and local cities.


Lastly, the police department in Ft. Lauderdale was chosen for a First of a Kind research development program with IBM. The purpose of this project is to analyze crime-related data sources using a combination of advanced technologies. During this project, IBM analyzed all kinds of data, from 911 calls to criminal records. While analyzing this range of data, they mainly focused on looking for patterns in location and time, unlike approaches such as the MyPersonality app, which focus on the individual person rather than on where and how a crime is being committed. According to International Business Machines (IBM), the research program will allow for a “deeper level of knowledge regarding possible contributing factors of crimes and foresee the demand for service at a more granular level of time and location…”, which ultimately enables the city “to move operations from reactive to proactive, leading to a safer city.”

Now, is this a good or bad thing? Well, I guess it depends on whether you like being watched or surveilled 24/7 and put into a database. Personally, I don’t feel comfortable knowing every move I make on and offline is being recorded and stored in a bigger data system. Although this is just the beginning for predictive data in law enforcement agencies, and technology will keep advancing, police swear they are not interested in violating our privacy. “We’re going to continue to test boundaries and I believe that’s what makes the United States such a great country,” says King.

There is still a long way to go when it comes to implementing predictive or big data in law enforcement agencies. Anybody can form a biased opinion from a person’s collected personal information without actually meeting them or knowing anything else about them at all. On the other hand, I have also seen people post pictures and announce the crimes they have just committed on social media. If people are bold enough to post their involvement in illegal activities, then the use of big data in that way is completely fine.

Is using big data the best decision? In today’s day and age, and given where we stand with racism as a country, I am not sure it’s the best decision. Police officers already have a biased opinion about kids and young adults of this generation, so will giving the cops more information that people didn’t know was being recorded and used against them actually help fight crime? I am a realist, and realistically, I’d like to see how this plays out, because all you hear today about social media and cops is that cops are intentionally killing people of color based on their personal feelings towards them. To me, this seems like adding fuel to the fire in some cases, but it could also be beneficial if used in the correct manner.


Week 14 Blog Post

The presentation I chose to respond to this week was given on Monday, about gender roles in social media; more specifically, the group looked at Instagram. What stuck with me about the presentation was that when they compared a famous female model’s Instagram page to a guy’s Instagram page, the content and number of followers were significantly different. The girl’s page had her modeling in a bikini in some places and looking really beautiful with a lot of makeup on, whereas the guy’s page had no photos of himself really, but of guy things like cars and landscapes. They concluded by saying that women are more often targeted sexually online, for many reasons. They also said that women will be judged more sexually than guys based on what they post, and that people shouldn’t be stereotyped by gender or race.

DTC 365 Final Project Topic

For the final project I will be doing my investigative writing on online data collection. I chose this topic because I want to learn more about how our personal data is being collected from online sites.


Annotated Bib:

  1. Neal W. Topp https://ntserver1.wsulibs.wsu.edu:4530/article/10.1023%2FA%3A1014669514367
    1. This article offers a historical background on how online data collection first began.
  2. https://www.theatlantic.com/business/archive/2015/09/discrimination-algorithms-disparate-impact/403969/
    1. This article is a good example of data collection. It also covers what can be done with that data once it is collected, and how things can go wrong when decisions are based only on the info collected.

When Discrimination Is Baked Into Algorithms: Nikki Aviles

In this article, Kirchner frames the legal repercussions in terms of discrimination: discrimination in a workplace or school, or against one’s gender, race, etc. The information gathered by data mining and algorithms is sensitive, personal information that has nothing to do with deciding whether someone is qualified for a job, or with the test scores they have. No one should have access to this information.

The laws today do not adequately cover the new phenomenon of “big data” bias. If someone is able to determine a person’s credit score, qualifications to be hired, or the length of their prison sentence, all from data mining and algorithms, then we are not being protected enough. “Even in situations where data miners are extremely careful, they can still affect discriminatory results with models that, quite unintentionally, pick out proxy variables for protected classes.”
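Here is a small Python sketch of what a “proxy variable” can look like. Everything in it is hypothetical: the applicants, the zip codes, and the `zip_based_rule` function are made up for illustration and do not come from the article. The rule never looks at the protected group, yet the two groups end up with very different selection rates because zip code happens to line up with group membership.

```python
from collections import defaultdict

# Hypothetical applicants: (protected_group, zip_code, qualified).
# Both groups have the same share of qualified people (3 out of 4).
APPLICANTS = [
    ("A", "11111", True), ("A", "11111", True), ("A", "11111", False),
    ("A", "22222", True),
    ("B", "22222", True), ("B", "22222", True), ("B", "22222", False),
    ("B", "11111", True),
]


def zip_based_rule(zip_code: str) -> bool:
    """A made-up decision rule that only looks at zip code, never at group."""
    return zip_code == "11111"


def selection_rates(applicants):
    """Return the fraction of each protected group that the rule selects."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for group, zip_code, _qualified in applicants:
        total[group] += 1
        if zip_based_rule(zip_code):
            selected[group] += 1
    return {group: selected[group] / total[group] for group in total}


if __name__ == "__main__":
    print(selection_rates(APPLICANTS))
```

In this toy data, group A is selected three times as often as group B even though group membership is never an input, which is the kind of unintentional result the quote describes.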

Digital Gender Divide

For this week’s blog post we looked at a study done by Martin Hilbert, which looked at how both men and women use technology in developing countries. Hilbert starts off with a little background information on how technology was seen as a male thing, and how women were seen as less tech savvy because they always second-guess themselves and “have low self-esteem”. He later explains how women began to catch up with men, and how ICTs can actually be an empowering tool for women, not only for business purposes but also to overcome discrimination against women.

My hypothesis/position on this? Well, reading this I was actually a little shocked. In today’s day and age, I feel like it’s the complete opposite. I feel like women are the superior beings when it comes to technology. The main reason I think this is the explosion of social media. Women are social beings, more so than most men, so when we found out we could communicate with our friends online, it was a game changer. Then Instagram came along, letting you show off your photography skills. Personally, seeing how social media and online data work is one of the reasons why I became a DTC major. This article was a little hard to read without any bias or my own strong opinion, but it was an eye opener.

Week 6 Blog Post

For this week’s reading, we took a look at Manuel Castells’ work on the Egyptian Revolution. In his research he talks about what first sparked the revolution and the support that made the revolution blow up. A big supporter of this revolution was the media, more specifically a video that someone posted on YouTube. This video exploded, and was eventually named “The Video that Helped Spark the Revolution”. This was the most effective way to spread awareness of what was going on in Egypt because of how fast and easily information was able to spread. People were posting on multiple platforms, mostly trying to expose the oppression going on during the Hosni Mubarak presidency.

One conclusion that we can draw from this reading is that social media was the main reason the Egyptian Revolution blew up. People would record events that happened during the revolution and post them to YouTube and Facebook, sometimes even live streaming them. The people of Egypt even coordinated events through Twitter, and used blogs to express how they felt about what was going on and to spread awareness. Although communication outside of social media was helpful in this revolution as well, social media had an unimaginable impact on the revolution.

Laurie Frick

With her data project, Laurie Frick wants to move away from basic algorithms and patterned ways of collecting data. Frick wants a more personalized approach, so that one day when you look at someone’s personal data, it’s not just data, but a “glimpse into their personality”. I took a look at another one of her projects called Stress Inventory. This is a clear example of collecting data in the most personal way possible. In this daily chart there are a number of different colored dots that represent different indicators of stress; how many you stack on for a given day depends on how much stress you encountered that day. This design is not only really aesthetically pleasing, but is also an exact embodiment of collecting data without the algorithms, and of actually being able to see someone’s personality come through.

Week 5 Blog Post

Lisa Nakamura and Peter Chow-White speak about matrix and data mining in chapter 6 of Race After the Internet. Both categories of classification, matrix and data mining, have to do with identification and classification online. These two categories gather information based on your personal online profiles. However, it can be hard to gather personal information on someone because most people have more than one personal interest and post about all of them. After analysts have gathered enough information on you, they spit that information back out to you in the form of ads. These ads usually pop up in the corners of popular websites. Most likely, everyone who has a social media profile is being surveilled.

One assumption I feel here is that mostly everyone today is on social media, but doesn’t know what’s going on behind closed doors, or behind the screen I should say. Most people don’t know that we are being watched and tracked. I had no idea the kind of information you can dig up from just one social media post, or just from a picture: you can find out where the person was at the time, what time the photo was taken, and what day. There are so many things that we don’t know we’re posting when we post. That’s a scary thing to me. Are we really in the land of the “free”?


According to WhatIs.com, infographics, or information graphics, are defined as a “representation of information in a graphic format designed to make the data easily understandable at a glance”. The graphic that caught my eye the most in Ross Hudgens’ post, The 100 Best Infographics, was the “Common Mythconceptions” graphic. I chose this graphic because, right off the bat, it has an interesting title; that’s what got me to stop scrolling and give it a look. The interesting content is actually what made this post go viral. This interactive graphic is nicely organized, separated into 4 different categories or myths. The different colored logos next to them also add an intriguing visual aspect. This graphic has been awarded some of the most icons on the website: the Great Idea, Unique Design, Covered by 50+ Outlets, and Interactive icons.

Week 4 Blog Post

After reading this article, my outlook on the internet has been altered a little bit. This week’s reading was about “Content Moderators” and the content being posted on the internet. According to The Laborers Who Keep Dick Pics and Beheadings Out of Your Facebook Feed, content moderation is the act of removing offensive material from the web. It never even crossed my mind that content moderation is actually a job; I thought that whatever you post automatically ends up on whatever website you wanted. I had no idea there were people sitting in an office monitoring what we post.

The article talks about content being posted on one specific website, Whisper. Whisper is an LA-based mobile startup that lets users anonymously post photos and their “secrets”. According to Michael Heyward, CEO of Whisper, they practice “active moderation” to try and keep their “toothpaste in the tube”. They are constantly bombarded with hundreds of posts at a time, and they need to decide on the spot what is appropriate and what isn’t. Having a job as a content moderator has a downside. Seeing content like pornography, beheadings, and sexual solicitation all day can drive a person crazy. Some people who have the job end up quitting. Jake Swearingen, 24 at the time, can recall the first time he saw a beheading on the job, and at that moment, everything changed for him. He said he did not want to become a “connoisseur of beheading videos”. He eventually ended up quitting and is now the social media editor at Atlantic Media.

The main assumption/argument that this article poses is that content moderators are beyond necessary. Although this job is a lot to handle at times, the Internet would not be the same without it.