Category Archives: Breach Analysis

A Breakdown and Analysis of the December, 2014 Sony Hack

Another incredibly far-reaching, in-depth compromise of Sony Pictures has happened, this time by a group known as the Guardians of Peace (GOP). The new compromise has all of the excitement of the old events and more, as blaming North Korea for the attack in retaliation for a movie being released by Sony Pictures is all the rage. Risk Based Security has been keeping an updated timeline of the breach, analyzing the leaked documents, and providing links to additional information.

If you are looking for a comprehensive resource on the Sony Hack then please visit the following page:

Nothing is certain but death, taxes and identity theft.

As we are well into tax season, there has been a trend of articles in the news involving identity theft and tax fraud. Individuals are stealing information from various sources, which are not only businesses, but also straight out of mailboxes in order to commit identity theft and file false tax returns. Some of these criminals have been reported to net as much as $11 million with their schemes before being caught. 641,690 incidents had been identified by the IRS as of September 30, 2012.

Each of these incidents is a concern. However, not all of them are reported in DatalossDB, as we require data loss incidents to have a steward organization. Therefore, we submit to our database only the schemes where personal data is stolen from an organization or business, and discard those where the data is stolen out of mailboxes, as they don’t fit our requirements.

Here are some snippets of the latest cases we have seen in the news; these cases include both ones DatalossDB would and would not catalog. There seems to be a trend in state employees and tax preparers stealing information to file false tax returns themselves or to sell the personal information to others.

In one case in Alabama, a state employee obtained identification information from a state database from October 2009 until April 2012. That is two and a half years in which she went undetected while working with co-conspirators to file over 1,000 false tax returns and receive fraudulent refunds totaling $1.7 million.

In Los Angeles County, a receptionist at the Department of Public Social Services had access to the systems used to input data and assistance requests. She took screenshots of 132 applicants’ PII (Personally Identifiable Information), and with the help of her husband and friends filed 65 tax returns in 2011, netting a total of $357,704.90 in fraudulent claims.

In Silver Spring, MD, two brothers running a tax service together stole identities from Puerto Rican residents to submit fraudulent claims through their business. They filed 13 false returns totalling $43,264.

Another tax preparer used the information of previous clients and deceased persons to defraud the IRS and taxpayers of over $200,000 from 2003 to 2008.

The largest case we’ve seen, which is currently awaiting sentencing, took place in Fort Lauderdale and involved the filing of around 2,000 false tax returns from October 2010 until June 2012. This particular identity theft tax fraud scheme pulled in over $11 million.

To many, this might seem like a great way to make money. Here are some of the punishments that have befallen or will befall these criminals. If convicted, the Alabama state employee is facing 20 years for each wire fraud count, 10 years for each computer fraud count, 10 years for conspiracy to file false claims, 2 years for aggravated identity theft, fines, and mandatory restitution. The tax preparer who used the information of clients and deceased persons was sentenced to 60 months in prison and ordered to pay full restitution in excess of $200,000. As for the case where the scheme pulled in around $11 million, one of the women involved is looking at possibly being sentenced to 351 years. That is around 6 lifetimes of prison!

The IRS is taking action in response to the increase in tax related identity theft over the last few years. They have activated new identity theft filters, and are working with over 130 financial institutions to help identify identity theft schemes. The IRS has also trained over 35,000 people, who have direct contact with taxpayers, in ways to help identify red flags associated with identity theft, and they have doubled the employees in their tax related identity theft department.

Multiple resources including the IRS are recommending a few things to help keep your identity safer. Make sure that you do not carry your Social Security card around in your wallet or purse; if you do, take it out and place it somewhere secure. In fact, it is a good idea to take any documents containing personal information and secure them in your home. Many businesses ask for your Social Security number, even if it is not mandatory information. It is best to not automatically provide it every time you are asked. Never give out your personal information over the phone or email; the IRS does not contact taxpayers either way to acquire information. Monitoring your credit report on a regular basis can help to identify identity theft, hopefully before the loss becomes severe.

Written by eabsetz

Is A Data Breach A Life Or Death Situation?

Most people would agree that security is important; however, many would have a hard time saying that a data breach could be a life or death situation. Sadly, in the past few weeks there have been two cases that may qualify for that characterization in the news.

The first case is the data breach at King Edward VII Hospital on December 4, 2012. Two Australian radio show hosts prank called the hospital in a joking attempt to get information on the condition of the Duchess of Cambridge. To their surprise, the nurse who answered the phone fell for the hoax and provided them with information on the Duchess’s condition and care. Last Friday, Jacintha Saldanha, the 46-year-old nurse who provided the information, committed suicide just two days after news of the breach was released.

The second case involves a data breach that occurred September 28, 2012 at the University of Georgia. A former student gained unauthorized access to a server containing 8,500 former and current employees’ names, Social Security numbers, and other sensitive information. With the investigation still under way, police announced on Tuesday that Charles Stapler Stell, the 26-year-old behind the data breach, had passed away. There was no indication of foul play, and the death was most likely the result of suicide.

In these two cases, the data breaches and their consequences appear to have pushed these individuals into a life or death decision. As the importance of privacy and security breaches increases, we have now seen that there are potential ramifications for the people involved that go beyond notification and credit monitoring.

As breaches unfortunately become more commonplace, impacted organizations should ensure that they have not only a response plan for dealing with the incident, but also a plan for constructively handling any employees at fault. While discipline from HR may be on the agenda, organizations need to ensure the wellbeing of their employees as they process their actions.


Written by eabsetz

Sony had HOW many breaches?

We thought keeping track of entities involved in the Epsilon breach was tough, but the recent spate of attacks on Sony networks has us working overtime trying to update the database. Thankfully, Jericho provided yeoman service and compiled a hyperlinked chronology of recent developments.

The Sony breaches have generated a lot of discussion. Some of it has centered on Sony’s shocking failure to encrypt passwords and it being all-too-vulnerable to SQLi compromises (if those posting the data publicly are accurate as to how they compromised certain databases). Sony undoubtedly has a lot of explaining to do if it hopes to have future assertions of industry-standard security taken seriously.

To date, the two largest incidents affected over 100 million records. But were the PSN and Sony Online Entertainment (SOE) attacks two separate incidents or were they really one breach? Should we have recorded one breach with over 100 million affected, or two incidents involving 77 million and 24.6 million, respectively? Or should we just treat the last 45 days’ incidents as one #EPIC #FAIL and one big incident? In light of our mission to track unique breaches, the question is not trivial.

When news of the second incident broke, the first thought was to update the PSN entry and add another 24.6 million to that counter. But as more details emerged, it seemed clear that we should treat it as a separate incident. The attack had occurred on different days than the PSN attack, the data compromised were on different networks, it seems quite likely the different networks had different security measures involved (Sony later testified that databases with credit card data were treated with higher security), we did not know if the same individuals were involved in both attacks, and the company itself was reporting it as a second incident previously unknown to them and not as an update to the other breach. Our impression that these were two unique incidents was subsequently supported by the reports made to the New Hampshire Attorney General’s Office for each incident (here and here).

Despite what we thought was an accurate way to track these breaches, one commenter questioned our decision to treat the reports as two unique incidents. A researcher with Javelin Strategy commented that treating this as two incidents instead of one benefited Sony: they would not appear ranked 2nd in our list of all-time largest breaches on our home page. Since these incidents had the same parent corporation, he suggested, they should be treated as one aggregated incident.

While those points may appear reasonable to some, we find them unpersuasive. First, we do not make decisions based on whether an entity benefits or suffers from a particular decision. We make decisions based on whether the available information supports aggregating the data for a particular incident or not. In this case, although it is the same parent corporation, the available information does not support aggregation. In other cases, such as a Wellpoint breach that was initially entered as distinct incidents, when my research revealed that there was only one incident and that what appeared to be a second incident was really due to Wellpoint’s vendor not fully securing the web sites after the first report, I recommended that those incidents be combined, and they will be. But other than a common target – Sony – where is there any evidence that this was just one incident? There is none.

We recognize that not everyone will agree with our decision, and that’s fine. Should new information become available that suggests that a one-incident approach is more appropriate for these incidents, we will edit our entries.

As always, we welcome constructive thoughts about how to make the database more useful to stakeholders, but we do not expect all of our decisions to please everyone.

/ Dissent

Epsilon Bingo

By now, everyone has probably read about a company named Epsilon. In fact, most people likely have secondhand involvement, having received one or more emails from companies they do business with warning them to be very careful after a recent incident. Most of these companies have used a similar form letter explaining the concerns and that you should be “cautious of phishing e-mails, where the sender tries to trick the recipient into disclosing confidential or personal information.” These notifications stem from Epsilon, a managed e-mail broadcasting company, getting compromised and having all of their customer e-mail addresses copied.

We have received a few emails from people asking us how we could have missed the Epsilon breach and why it isn’t on our site. Well, it actually is on the site, as we do follow incidents such as this; however, it is listed as a Fringe incident. Why “Fringe”? From what we can tell so far, the breach (while unacceptable) is contained to Names and Email Addresses. We do recognize that this information may increase the risk to customers, as targeted spearphishing attempts may be more successful; however, there is no loss of PII. We have debated this topic for years, and instead of excluding such incidents from DataLossDB, we now just label them Fringe. There will be more debate on the severity of this incident for sure. Some think it is critical and others merely say that their email address was never meant to be private anyway. There are good arguments supporting both sides of the debate.

We will be continuing to add all of the affected organizations as we learn about them, and you can see the incident here:

When Epsilon posted the notice on their site they mentioned: “On March 30th, an incident was detected where a subset of Epsilon clients’ customer data were exposed by an unauthorized entry into Epsilon’s email system.”

As of April 4th, they have updated the definition of “subset” to mean “The affected clients are approximately 2 percent of total clients and are a subset of clients for which Epsilon provides email services.”

As of today, we are aware of a little over 40 companies affected and more notices are pouring in from users. As to how many users are impacted that is anyone’s guess. Our guess is A LOT.

If you want to read some of the notices we have received, over a dozen are on our mailing lists archives:

For those that want to play along, we have decided to make some Epsilon Bingo Cards. If you are able to fill up a whole card and prove it with the notices we might have to give you a prize… that is the least we could do, right?

As always, please keep sending us any notices that we are missing so that we may better gauge the scope of this incident and update the cards.

JCPenney has dodged a huge bullet… until now.

As is now being reported in the mainstream media, JCPenney was “Company A” in the recently infamous Albert Gonzalez trial. In court filings, we found some attachments that seem to have been a convincing factor in the judge’s decision to unseal the identity of “Company A”, a.k.a. JCPenney. JCP fought hard to keep its identity concealed, but ultimately it would seem that these attachments, as well as some reporting by Evan Schuman, made the difference.

Attachment A, filed in document 14 of the case (for those following the case on PACER, etc.), shows ICQ chat extracts where Gonzalez and a co-conspirator discuss JCPenney. It is damning from a security professional’s point of view. It would seem almost irrefutable that JCPenney was compromised. How many cards were stolen is unknown, but cards were almost undoubtedly stolen and JCPenney has (until now) seemingly dodged a huge public relations bullet. Below is a snippet from the attachment:

  • Gonzalez: “what did hacker 2 say about jcp?”
  • Conspirator: “he hacked 100+ sqls inside and stopped”
  • Gonzalez: “hacker 2 told me he found a place to snif for dumps in jcp”
  • Gonzalez: “i see, hacker 2 showed you anything?”

Gonzalez then posts what appears to be names and credit card details (redacted in the court docs). They then go on to talk about how one of the conspirators had “domain admin” access, suggesting that they pretty much had control of everything in the given network (depending on topology and segregation).

We struggled with a possible JCPenney incident before reading this document. We initially categorized it as “fringe”, but it seems pretty obvious at this point that JCPenney was either:

  1. just hacked, or
  2. hacked badly enough to expose card data

But judge for yourself: here’s the attachment and the full pdf we obtained (including the attachment) for context. If you use these, please credit the Open Security Foundation for buying these and making them public — you don’t have to as they are public record, but we did have to pay for them, so we’d appreciate the credit!

Where did the breach go?

Where on earth did the breach go? We’ve asked ourselves, we’ve asked others, and we’ve been asked by many.

The simple answer is, we don’t know! It could be anything, really, that has caused the dramatic decline in reported data loss incidents in 2009. Here are a few ideas:

  • The decline is media related. Data breaches are ‘passé’.
  • Organizations are implementing better security.
  • Organizations aren’t reporting incidents.
  • Solar Flares

None of these, with the exception of solar flares, is likely to be analyzable at first glance. But what about the first bullet?

Due to a lack of expertise in space weather, we decided to dive into the Google News archives, and things became interesting. Google News’ timeline feature facilitates this kind of analysis. We looked through search result totals matching the query “data breach”, per month, for 72 months (2004 through 2009). We then tossed the data into a graph, added a polynomial trend-line with an order of 6, and took a deep breath.
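For readers who want to reproduce the trend-line step on their own numbers, here is a minimal sketch of an order-6 least-squares polynomial fit in plain Python. The function names are ours and purely illustrative, and the series in the usage line is a made-up placeholder, not the Google News totals described above.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_polynomial_trend(counts, order=6):
    """Return the fitted trend value for each month in `counts`."""
    n = len(counts)
    # Rescale the month index to [-1, 1] so high powers stay well behaved.
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    size = order + 1
    # Normal equations for least squares: (X^T X) coeffs = X^T y.
    gram = [[sum(x ** (j + k) for x in xs) for k in range(size)]
            for j in range(size)]
    moments = [sum(y * x ** j for x, y in zip(xs, counts))
               for j in range(size)]
    coeffs = solve_linear_system(gram, moments)
    return [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]


# Placeholder series only: 72 made-up monthly totals, NOT real search data.
placeholder_counts = [100.0 + 3.0 * month for month in range(72)]
trend = fit_polynomial_trend(placeholder_counts, order=6)
```

A high-order polynomial follows the shape of the data closely inside the observed range but behaves erratically beyond it, so a trend-line like this is a visual aid rather than a forecast.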

Posted by d2d

When Reporters Go Looking For Data Breaches…

They often find them, and usually get a complimentary legal threat or outright lawsuit to go with it.

Recently, a Minnesota Public Radio reporter went digging, and indeed found records exposed. The records in question were I-9 processing forms held by Texas-based Lookout Services. The undisputed truth seems to end about there. The reporter wrote about the incident, and the attention the incident stirred caused the entire state of Minnesota to stop using Lookout Services for I-9 verification. Lookout Services responded with a lawsuit, essentially claiming that MPR illegally accessed the data.

Now, MPR claims it didn’t need to authenticate in order to access the data. Lookout Services supposedly disagrees, according to a great article by reporter David Brauer, which gives excellent background into the issue. My interpretation is this: an authenticated connection was used to find a URL that granted access to data without authentication. For instance, most modern web applications determine if a login session is established on every request to the website. It is possible to ‘omit’ or ‘forget’ to check on certain requests, and just hand the content over. If I had to bet, I’d bet that the reporter found such an omission, then ran with it.
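To make that interpretation concrete, here is a toy sketch of the kind of omission described above. Everything in it (the session store, route names, and handlers) is hypothetical and invented for illustration; it is not a description of how Lookout Services’ application actually worked.

```python
# Hypothetical in-memory session store: token -> user name.
SESSIONS = {"token-abc": "alice"}


def require_login(handler):
    """Wrap a handler so it refuses requests without a valid session."""
    def wrapped(request):
        if request.get("session_token") not in SESSIONS:
            return 401, "login required"
        return handler(request)
    return wrapped


@require_login
def list_documents(request):
    # Only logged-in users can see the index, which reveals document URLs.
    return 200, "documents available at /document?id=1"


# The omission: this handler was never wrapped with require_login,
# so anyone who learns the URL gets the content without a session.
def get_document(request):
    return 200, "contents of document " + request["id"]


ROUTES = {"/documents": list_documents, "/document": get_document}


def handle(path, request):
    """Dispatch a request (a plain dict here) to the handler for its path."""
    return ROUTES[path](request)


# The index is protected, but the document URL found through it is not.
index_status, _ = handle("/documents", {})
doc_status, _ = handle("/document", {"id": "1"})

# The fix is to apply the same check to every handler.
fixed_status, _ = require_login(get_document)({"id": "1"})
```

In this sketch the unauthenticated request to the index is rejected while the unauthenticated request to the document succeeds, which is the pattern guessed at above: the login check exists, but one route forgot to use it.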

Was it illegal to do so, if that indeed is what happened? Maybe! Which is why reporters should really tread carefully when trying to ‘create news’.

This isn’t a recent phenomenon either.

In October of 2008, a WHRW campus news reporter, while reportedly walking through campus, stumbled upon an unlocked room with the door taped open. The room reportedly contained thousands of student records. The reporter announced the “breach” on the radio and blogged about it. The University then began a criminal investigation, which involved the district attorney, per this reporting:

The room was apparently several feet off the ground on a maintenance catwalk, and the reporter didn’t simply ‘stumble upon’ it. Was it a ‘good thing’ that this data storage issue was brought to light? Maybe, but was it done properly? We didn’t even add this incident to the database, although the option to do so isn’t exactly off the table.

Sometimes these things go without bickering between the reporter and the breached entity, but usually only when the breach was wide open and it is clear that the reporter did indeed just happen to stumble upon it.

Searching for data breaches isn’t something we at OSF have ever really condoned. It falls in a grey area for certain. Even using search engines to find the data, such as utilizing public resources to find publicly accessible data, is somewhat questionable as the act itself mirrors what an ID thief would do.

Legal Sub-Project – Elvey v. TD Ameritrade

The TD Ameritrade incident of 2007 hasn’t quite been resolved, at least not yet. While the breach may have been contained, the litigation is still ongoing. A class action suit filed in California in May of 2007 has reached a preliminary settlement, but the settlement is contested by the individual who filed the class action in the first place, and the case has been through some extremely interesting twists and turns.

The case was filed in May of 2007, with a complaint that claimed that TD Ameritrade was essentially selling email addresses of clients to spammers, in violation of TD Ameritrade’s privacy policies and various laws.

A motion for a preliminary injunction kicked things into gear in July 2007. The motion alleged that the spam was still ongoing, and demanded that TD Ameritrade take steps to protect members of the class (TD Ameritrade customers). The fact that the incident was still ongoing at the time of the injunction was later confirmed in testimony, and it would seem from interpreting the various testimonies in the case that the breach was mitigated “on or about August 14th, 2007”.

Sometime thereafter, TD Ameritrade acknowledged that it had in fact been “hacked”, and that the hacker had access to names and email addresses. During the disclosure (via a letter to customers), TD Ameritrade also acknowledged that the database that had been breached also contained Social Security numbers, but that TD Ameritrade had no evidence that Social Security numbers had been taken. This spawned another lawsuit: Brad Zigler v. TD Ameritrade. The complaint in this new lawsuit went beyond the spam aspect, and brought into view the potential compromise of Social Security numbers as well. In December of 2007, the two cases became officially related.

In early 2008, a new judge was assigned to the case. Several months later, the two cases merged, and a request to have a settlement approved was filed by the plaintiffs (on May 30, 2008). Both sides seemed in agreement at the time. Days later, at a proceeding, that agreement appeared to have dissolved. One of the class representatives, Matthew Elvey, the individual who had originally filed the case in May 2007, opposed the settlement — even though he had signed it days prior. Mr. Elvey stated that he had been threatened, which is why he agreed to sign the settlement. His opposition claimed that the settlement was not fair, that he had been an identity theft victim as a result of the TD Ameritrade breach, and that some of the reasoning behind the decision to settle was flawed. During the same court hearing, one of the most significantly discussed “reasons” for settling was the results of an “organized misuse” analysis, which was done by a third party organization, ID Analytics. This reason was particularly opposed by Mr. Elvey.

Now, before we dig into “organized misuse”, we should first look at how one might assume a traditional investigation into a data breach would proceed. One would suppose that both during and after a breach, an organization experiencing the breach would first try to stop and contain it, try to assess what exactly occurred, and then understand what was accessible, accessed, potentially lost, and confirmed as lost. In containing the breach, one might assume an organization would act swiftly, yet carefully. In assessing the scope, one might think an organization would look to internal security systems to make determinations: network logs, system logs, audit logs, and transaction logs. An organization might also contract with a firm with forensic expertise to assist in making determinations and provide further analysis. Supposedly, this sort of analysis did occur. The “security officer” responsible at TD Ameritrade, William Edwards, gave a deposition regarding the details of the breach, which was sealed for “attorney’s eyes only”, so we can’t conclude much at all from it. But back to the hypothetical: what if the aforementioned “expected” protocol didn’t provide sufficient information, or perhaps didn’t provide “ideal” conclusions? Put differently, what if those conclusions did not give the organization the answer it wanted to hear?

Fortunately, there’s another option: a now nearly court-proven way to gain intelligence into the matter… in comes an “organized misuse” analysis. Companies, with what appears to be access to and/or partnerships with credit bureaus, can run some form of pattern analysis to determine whether or not identity theft is linked with a given organization, population, or sample. Presumably, they analyze occurrences of ID thefts in a sample, and determine whether or not the samples show a higher occurrence of ID theft than a baseline sample/population (no doubt via some fancy math and other complicated stuff.)
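As a rough illustration of what “comparing a sample against a baseline” could look like, here is a minimal sketch using a standard two-proportion z-test. This is our own guess at the general shape of such an analysis, not ID Analytics’ actual method, and the function names and counts are invented for the example.

```python
import math


def misuse_z_score(sample_hits, sample_size, baseline_hits, baseline_size):
    """Two-proportion z-score: how far the sample's ID theft rate sits
    above (positive) or below (negative) the baseline rate."""
    sample_rate = sample_hits / sample_size
    baseline_rate = baseline_hits / baseline_size
    # Pooled rate under the assumption that both groups share one true rate.
    pooled = (sample_hits + baseline_hits) / (sample_size + baseline_size)
    std_err = math.sqrt(
        pooled * (1.0 - pooled) * (1.0 / sample_size + 1.0 / baseline_size)
    )
    return (sample_rate - baseline_rate) / std_err


def one_sided_p_value(z):
    """Probability of a z-score at least this large if rates were equal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Made-up counts: 80 ID theft reports among 1,000 breached records versus
# 50 among 1,000 baseline records.
z = misuse_z_score(80, 1000, 50, 1000)
p = one_sided_p_value(z)
```

A z-score near zero only says that no elevated rate was observed in the data available at the time of the analysis. It cannot show that the records were never taken, and the calculation breaks down (division by zero) if neither group has any reported thefts.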

Where this all gets interesting is that when the TD Ameritrade incident was originally disclosed, there was no mention of Social Security numbers being affected. OSF did not include it as a data type, nor did we find any indication in any reports regarding the incident that they had been included. In the process of fighting this class action suit, however, TD Ameritrade used an outside firm to run this “organized misuse” analysis, which came back as “negative”. TD Ameritrade could have simply said that Social Security numbers were not accessible, but they didn’t, which would imply that they were indeed accessible to the intruders. Nowhere in any of the documents we reviewed did we find any denial of this, and in fact, in many instances they confirmed that “Social Security numbers were in the database”.

That statement is very different from TD Ameritrade *outright* saying that Social Security numbers were accessible. It could have been that the nature of the compromise exposed a database view, and that Social Security numbers were not accessible to that view. Had that been the case, saying that they were not accessible seems like a stronger defense than going through an expensive “organized misuse” analysis process. It would seem evident that proving there were logical or physical gates in place that separated the data, and thus made it inaccessible, would have been a less expensive and more convincing argument to make, but no actual attempt was made to refute accessibility. From that, it does not seem a far stretch to assume that the numbers were accessible.

Even still, it seems that relying on “organized misuse” analytics as some sort of “proof” that a breach of Social Security numbers did not occur is a bit curious, and also possibly a logical fallacy. For one, it would only be reliable at the point in time when it was concluded, and actually might only be representative of a point in time months prior given the delay with which credit data is populated. It could never definitively conclude that a breach of identities did not occur, given that there could simply be the case that the stolen identities hadn’t been sold or otherwise abused at the time of the analysis. Given the permanent nature of identities and specifically, Social Security numbers, it also does not seem implausible that an identity thief might “hold on” to their find for some duration prior to capitalizing on it as a way of “laundering” the identities. Granted, this is speculative, but so is the presumption that since no evidence of “organized misuse” exists, Social Security numbers had not been compromised.

Regardless, the settlement would have essentially consisted of the following:

    • TD Ameritrade would post notices 4 times in the year, for 1 week each, regarding the incident.
    • Members of the class would get a free 1 year subscription for Trend Micro Internet Security Pro (retail value $69.96). The software was to address the spam that came as a result of the disclosure of TD Ameritrade customers’ email addresses.
    • TD Ameritrade would commit to twice yearly external penetration testing.
    • TD Ameritrade would perform account seeding to detect compromise of email accounts.
    • Class members would give up their right to form another class action lawsuit, but could pursue TD Ameritrade as individuals if identity theft did occur as a result of the breach.
    • TD Ameritrade would donate $20,000 to the Honeynet Project, and $35,000 to the National Cyber Forensics and Training Alliance.
    • TD Ameritrade would cover all legal expenses of the case incurred by the class.
    • A settlement notice would be posted in USA Today.

Elvey retained additional counsel to oppose the settlement that he and his original counsel had signed. Over the course of several months, and several court appearances, the plaintiff and the defendant seemed to “buddy up” to some degree, while Elvey continued to oppose with his new representation. Elvey had all but seemed discredited when, in late 2008, the Texas Attorney General jumped in on behalf of a stated near half-million Texans represented in the class. The Texas AG had the following to say (as summarized by the judge):

      • the proposed settlement agreement offered “no meaningful relief to the class members”;
      • the award of proposed fees to class counsel was excessive;
      • the proposed settlement failed to address the harm of identity theft adequately;
      • the proposed release was too broad;
      • The Texas Attorney General contended that the settlement was essentially worthless because the “warning” to be placed on the TD Ameritrade website would largely go unseen by consumers most vulnerable to stock spam;
      • the security measures TD Ameritrade agreed to conduct should have been conducted by “any reputable company” anyway;
      • the coupon for security software was of little value because similar software was largely available to most Internet users for free or at low cost;
      • the Texas Attorney General noted that the class members were to receive no monetary recovery while the proposed attorney fee award for class counsel was substantial ($1.87 million);
      • the proposed settlement agreement did not address adequately the potential harm to class members from identity theft;
      • the Texas Attorney General further argued that the settlement agreement should make clear that the individuals who engaged in the unauthorized access are not “Released Parties” and that “Releasing Parties” should be amended to make clear that government entities such as the Texas Attorney General have not released any claims to relief related to this security breach;

These oppositions were strong, and spun off months of additional negotiations between the plaintiff, the defendant, and the Texas AG’s office. The revamped settlement, which won the approval of the Texas AG, was a slightly improved version. It emphasized somewhat more the risk of ID theft from the breach, and also removed or revamped some of the limits that class members would have had imposed on them for additional suits, but substantively didn’t really alter much.

What it did change was that it created a new argument for the defendant and the plaintiff: “The Texas AG signed off…”, which sealed the deal and seemed to outweigh any opposition to the settlement by Mr. Elvey. The revised settlement was “preliminarily approved”, on May 1st, 2009, bringing the class action suit a big leap forward towards conclusion.

In all, this is a fascinating case, which raises several questions: why is this “organized misuse” so convincing? What is so confidential about the deposition given by Mr. Edwards? It was sealed for several reasons, some of which seem a little far-fetched. One was that it might expose the class to the risk of identity theft, and that was vaguely related to the fear that such information, if made public, would somehow entice or encourage hackers to go after TD Ameritrade. This doesn’t seem all that realistic. The firm has a million reasons to be concerned about security, but other aspects of the case suggest that this “concern” is a recent phenomenon at TD Ameritrade. For instance: how exactly is a commitment to perform “twice yearly independent vulnerability scans” a benefit to the class? Is TD Ameritrade not already required to do so by industry standards like PCI, or better yet, by its own internal security policies? Was this not point 6 of the Texas AG’s argument? And why did the Texas AG back down on several points?

And those are just the questions on one side of the coin. Why did Elvey approve the settlement in the first place? The “threats” claimed could use some additional scrutiny. Had he not signed the settlement, would things have gone much differently? Did Elvey’s “late game” claims of identity theft help or hurt his case?

We don’t yet have all the final numbers on this, as the case is still ongoing, but when we do we’ll update the incident with the final costs associated with this class action suit. The costs will be of some substance, but from the looks of it, a very small amount per record breached. We are updating the data types to include Social Security numbers, partially because of a recent article in the media on the topic, and partially due to the information gathered from the court documents. All the documents we’ve collected regarding this case are available for your perusal here.

We believe that legal insight into data loss incidents, and the costs associated with them, are key to fully understanding their true impact. We are in the process of starting a new legal sub-project that will be tightly integrated into DataLossDB. The project will focus on collecting information on lawsuits associated with data loss incidents. The goal is to provide more depth to the data, give us some editorial fodder, and most importantly, get some empirical data on the legal costs of a data loss incident. If you are interested in helping to lead, shape, and ultimately maintain this project please contact

Walmart, Primary Sources, Left Field

We knew when we started the Primary Sources Archive that we’d find some interesting incidents. There was little doubt that what we were seeing reported in the media was a fraction of what was really going on, and we continue to feel that even what we find in the media and in primary sources combined still represents a fraction of what really goes on. We did not exactly anticipate finding enormous unreported breaches via primary sources, however.

We recently launched a small initiative to get primary sources via volunteer contributors from across the 50 states. One volunteer recently submitted a batch of files to us, obtained through a FOIA-equivalent request to the state of Illinois. Most of those were incidents we already knew about, with a few exceptions, one of them rather large.

It would seem that Walmart experienced a significant breach in mid-2007 that we had never heard of in the media. A former employee left Walmart with personnel data of over 48,000 Walmart associates residing in the state of Illinois. That is an enormous number of records for just one state.

In reading the language of the document obtained, it would seem that the breach wasn’t exclusively affecting residents of Illinois, leading us to ask: who else was affected, and why haven’t we seen this elsewhere? If we make the assumption that the breach was nationwide, then it may have affected over a million people. Considering Walmart employs 1.8 million people, that estimate is not implausible.

  • Number Affected in Illinois * Population of the USA / Population of Illinois = Number Affected in USA
  • 48,000 * 300,000,000 / 12,852,548 = 1,120,400

That assumes that the breach isn’t localized, and that population is a reliable metric for measuring data loss incidents, neither of which is known. Regardless, this is a significant breach, and we never heard of it until now.
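The extrapolation above can be reproduced with a short Python sketch. The function and variable names are hypothetical, chosen only for illustration. The figures are the ones used in the calculation above, and the result carries the same caveat: it assumes the breach scales with state population, which is not known.

```python
def extrapolate_affected(state_affected, state_population, national_population):
    """Scale a state-level count to a national estimate by population ratio.

    Assumption (as in the post): the breach is spread across the country in
    proportion to population. This is an illustration, not a confirmed figure.
    """
    return state_affected * national_population / state_population


# Figures taken from the calculation in the post.
illinois_affected = 48_000
illinois_population = 12_852_548
us_population = 300_000_000

estimate = extrapolate_affected(illinois_affected, illinois_population, us_population)
print(f"{estimate:,.0f}")  # prints 1,120,400
```

Changing either population figure shifts the estimate proportionally, which is one reason to treat it as a rough order of magnitude rather than a count.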

We have several FOIA-equivalent requests out for data from that timeframe, which may shed more light on the incident, but we found it interesting enough to post now.

As an aside, and while on the topic of older breaches, the Oldest Data Loss Incidents contest is still underway. We have great prizes available, so be sure to compete!

Posted by d2d