Archive for the 'data mining' Category


Also see the data mining category on the Blindside Wiki

Is This Good or Bad News?

Posted by Tom Fuller in Blindside project, Humanity nature and activity, data mining, fraud at November 30th, 2007

How easy would it be to find this information for UK and continental Europe?

“An estimated 8.3 million Americans over the age of 18 were victims of identity theft in 2005, according to an analysis of a phone survey released Tuesday by the FTC. That represented a decline of about 16 percent from an estimated 9.9 million victims in 2003, when the agency last conducted its survey.”

“Identity theft cost U.S. businesses $55.7 billion in 2006, according to Javelin Strategy & Research. The FTC estimates that in 2006 the cost to consumers was $1.2 billion.

But experts say complaints filed with the FTC offer only a glimpse of the actual damage. “Most people don’t even think about calling the government because they are not going to help them get their money back,” Litan said.

The FTC estimates that 1.8 million Americans discovered some type of fraud committed using their personal information, 3.2 million had their credit card accounts misused and 3.3 million experienced misuse of other financial accounts.

Javelin’s estimates back the FTC’s findings. It said 8.4 million people were victims of identity theft in 2007, down from 8.9 million in 2006 and 9.3 million in 2005.”

Impacts of Hacked Information

Posted by Tom Fuller in Blindside project, Data breaches, IT failures, data mining, databases, fraud, human error at November 8th, 2007

Via Kable: “The Land Registry has pulled potentially sensitive documents from its online service. As from midnight on 5 November 2007, online access to documents such as mortgage deeds and leases will be removed. Members of the public wishing to inspect or have copies of any such documents can do so by applying in writing to Land Registry. The move followed a report in The Daily Mail that criminal gangs have stolen £12m over the past two years by exploiting loopholes in the website. They gained access to documents such as title deeds to make it possible to sell properties they did not own.”

It’s a pity legitimate users of Land Registry information will no longer have access to these details, I guess, but what were sensitive documents like these doing lying around in the open air in the first place? Did any review of this take place?

After the fact, the Land Registry tried to ‘put this in perspective,’ saying that the £12 million in fraud was a small percentage of the fee income it generated.

WAKE UP. The £12 million in fraud in all probability represented a very large percentage of the total wealth of the individuals who were defrauded, each of whom had to go through a long and laborious compensation exercise and probably had to get the services of a solicitor to help them. Of course it had minimal impact on the Land Registry. It’s not their money. It’s not their information. It’s not their privacy.

Web 2.0 and Information Assurance

Posted by Tom Fuller in Blindside project, data mining, databases at November 2nd, 2007

It doesn’t seem as if anybody has noticed yet, but what Web 2.0 (really, it’s Web 1.05, in my opinion) is all about is databases accessible from the Internet.

A weblog such as this is a database with scripting that generates a valid URL and a time/date stamp when a field is entered. A wiki is the same thing, but instead of publishing the time and date stamp, it allows rewriting. Mashing up of data is just porting data from two different databases to a third location and doing useful work on the data at its new home.
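To make the "a weblog is just a database with a bit of scripting" point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `posts` dictionary stands in for the database table, and `slugify` and `publish` are hypothetical names, not how any particular blogging package works.

```python
import re
from datetime import datetime, timezone

# Stands in for the database table; the key is the generated URL path.
posts = {}

def slugify(title):
    """Turn a post title into a URL-safe fragment."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def publish(title, body, now=None):
    """Store an entry and return the URL generated for it."""
    now = now or datetime.now(timezone.utc)
    url = f"/{now:%Y/%m/%d}/{slugify(title)}/"
    # The time/date stamp is recorded at the moment the field is entered.
    posts[url] = {"title": title, "body": body, "posted_at": now.isoformat()}
    return url
```

A wiki, on this view, would be the same table with an extra function that overwrites `posts[url]["body"]` instead of refusing to.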

All of the recognisable Web 2.0 success stories are variations on the theme: MySpace and Facebook are blog farms; Flickr and YouTube are modified blog farms. Databases all.

I won’t really start thinking of Web 2.0 as Web 2.0 until it does what it says on the tin by incorporating off-net data into its offerings, and starts sending fused data streams outwards, both on and offline. When SMS and Skype seamlessly integrate into a web offering, we’ve got something. When UpMyStreet automatically texts me to stay out of this neighbourhood because of night-time crime statistics, then we’re onto something. Similarly, when I receive an SMS in a bar telling me that someone with my Facebook profile is in the same bar and is available for conversation, then Web 2.0 is here. Because for me, Web 2.0 is all about moving information off the Internet and into the real world and vice versa.

Sadly, the information assurance issues regarding databases are significant and so far less than amenable to easy solution. Databases will take notes of changes made to them, but unless the data is archived before the change is accepted, those notes are only useful in assigning responsibility for errors and crime. Archiving large scale databases prior to accepting any change would be a bit impractical.
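One partial answer, offered as a sketch rather than a solution, is to archive only the row being changed instead of the whole database. The example below uses Python's built-in `sqlite3` module and a trigger that copies the old values into a history table before an update is accepted. The table names, columns and the helper functions `make_db` and `previous_versions` are all hypothetical, and this says nothing about how well the approach scales to the large systems discussed here.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a before-image history table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE residents (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
        CREATE TABLE residents_history (
            id INTEGER, name TEXT, address TEXT,
            changed_at TEXT DEFAULT CURRENT_TIMESTAMP
        );
        -- Copy the old row into the history table before each update.
        CREATE TRIGGER keep_before_image
        BEFORE UPDATE ON residents
        BEGIN
            INSERT INTO residents_history (id, name, address)
            VALUES (OLD.id, OLD.name, OLD.address);
        END;
    """)
    return conn

def previous_versions(conn, row_id):
    """Return the archived (name, address) pairs for one row, oldest first."""
    return conn.execute(
        "SELECT name, address FROM residents_history WHERE id = ? ORDER BY rowid",
        (row_id,),
    ).fetchall()
```

With this in place, an update to a resident's address leaves the earlier address recoverable from `residents_history`, so the change log can be used to put things right and not merely to assign blame. It does nothing about deletions or about someone with enough access to alter the history table itself.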

If anybody can talk us all through a practical guide to effective information assurance for databases, the comments field is all yours…. here’s hoping.

Breaking the System to Save It?

Posted by Tom Fuller in AnonymitY, Blindside project, data mining at October 30th, 2007

The Internet database WhoIs may be marked for destruction, if some privacy advocates have their way. The database is regularly used by law enforcement officials and contains contact information of website owners.

The story is covered in some detail here.

“What removing the status quo will do is force all of the actors to come together without the benefit of a status quo to fall back on and say, ‘We are now all screwed. What will we do?’” Rader (Ross Rader, a member of ICANN’s generic name council) said. “It will lead to better good-faith negotiations.”

The issues are quite important–the database has clear value, but the potential for abuse is quite high. Because of ham-handed law enforcement and anti-terrorist measures in the recent past (mostly in the U.S. and U.K.), a significant percentage of stakeholders are willing to give up the database to prevent abuse.

“Law-enforcement officials and Internet service providers use it to fight fraud and hacking. Lawyers depend on it to chase trademark and copyright violators. Journalists rely on it to reach Web site owners. And spammers mine it to send junk mailings for Web site hosting and other services.

Internet users, meanwhile, have come to expect more privacy and even anonymity. The requirements for domain name owners to provide such details also contradict some European privacy laws that are stricter than those in the United States.

There’s agreement that more could be done to improve the accuracy of Whois, as scammers and even legitimate individuals who want to remain anonymous can easily enter fake data.

The disagreements are over “who gets to see it (and) how can we protect people’s privacy while at the same time making accurate information available to those who need it,” said Vint Cerf, ICANN’s chairman.”

The lesson to be learned is to take privacy seriously and not to sacrifice your long term credibility for short term information gains. But there is no evidence that that lesson will in fact be learned.

Bruce Schneier’s Cryptogram

Posted by Tom Fuller in Blindside project, Cyberwar, Data breaches, data mining, databases, e-ID at October 15th, 2007

I suppose I should pretend I did all the research that produces the following, but I just opened the email from Bruce Schneier’s Cryptogram. If you’re serious about these issues (and why else would you be reading this?), click here to subscribe.

Quotes from this issue:

“Although it’s most commonly called a worm, Storm is really more: a worm, a Trojan horse and a bot all rolled into one. It’s also the most successful example we have of a new breed of worm, and I’ve seen estimates that between 1 million and 50 million computers have been infected worldwide.”

UK Police Can Now Demand Encryption Keys: “Cambridge University security expert Richard Clayton said in May of 2006 that such laws would only encourage businesses to house their cryptography operations out of the reach of UK investigators, potentially harming the country’s economy. ‘The controversy here [lies in] seizing keys, not in forcing people to decrypt. The power to seize encryption keys is spooking big business,’ Clayton said.

“‘The notion that international bankers would be wary of bringing master keys into UK if they could be seized as part of legitimate police operations, or by a corrupt chief constable, has quite a lot of traction,’ he added. ‘With the appropriate paperwork, keys can be seized. If you’re an international banker you’ll plonk your headquarters in Zurich.’”

“Microsoft updates both XP and Vista without user permission or notification. Microsoft can do this; that’s just stupid company stuff. But what’s to stop anyone else from using Microsoft’s stealth remote install capability to put anything onto anyone’s computer? How long before some smart hacker exploits this, and then writes a program that will allow all the dumb hackers to do it?”

London’s 10,000 security cameras don’t reduce crime:
http://www.thisislondon.co.uk/news/article-23412867-details/Tens+of+thousands+of+CCTV+cameras%2C+yet+80%25+of+crime+unsolved/article.do
or http://tinyurl.com/286pab
This is a follow-up to a 2005 article:
http://www.thisislondon.co.uk/news/article-16856213-details/CCTV+’does+not+stop+crime’/article.do
or http://tinyurl.com/2tfjyf

Just go and subscribe, or read them on his weblog.

Womb to Tomb Identity Control

“The General Register Office, which oversees the registration of births and deaths, is to become part of the Identity and Passport Service in a move that is likely to see sharply increased data sharing between the two bodies.”

This is, or should be, the story of the week.

“The government plans to give IPS staff online access to births and deaths information which could be cross checked with ID card or passport applications. Data sharing between the two bodies was given a legal basis in July by an order made under section 38 of the Identity Cards Act.”

In the story linked to above, Phil Booth of No2ID makes the badly needed points, and I doubt if he’ll mind if he’s quoted at length:

“But Phil Booth, national coordinator of the No2ID campaign monitoring the government’s ID card and data sharing plans, described the merger as “chilling.”

It was “deeply worrying” that the GRO, a “formerly independent agency should be subsumed in this way, with no debate and for no apparent reason other than bureaucratic convenience,” he said.

Birth and death dates might form part of an individual’s official identity, but register offices also recorded other information such as details about parents, Booth pointed out.

“The ID program is insinuating itself deeper and deeper into people’s lives. This is not so much ‘feature creep’ as a blatant land-grab of personal identity.

“That an agency which until a little over a year ago was limited to issuing passports is now grabbing control of citizen data from cradle to grave, and openly talks about ‘registration of life events,’ confirms what NO2ID has said all along. It’s not about ID cards, but the creation of a detailed, lifelong government dossier on every person,” Booth said.

He added “And that this sits in the dysfunctional and acquisitive culture of the Home Office should certainly make people think twice.”

Could Be Very Good

Posted by Tom Fuller in AnonymitY, Blindside project, data mining, databases, psychology at September 23rd, 2007

Via Computer Weekly, we see that “The London Borough of Brent is working on a project to provide a single view of residents’ data which will allow the council to improve customer service and the overall accuracy of council records. When complete in November, the project will allow Brent to conduct customer profiling in order to improve council services and offer additional services to residents. It will also help Brent comply with the Data Protection Act, which requires that information stored on an individual should be accurate.”

This could be very good. “The project has involved mapping out which systems hold the most accurate information. Customer data is extracted from the nine core council systems each night. The Initiate tool then matches customer records from each of these systems and links them together to form a master index of all customer information called the Client Index. Aside from building the master customer record the project also includes identifying change of circumstances eg change of address that have been recorded on council systems. All changes are passed back to council departments to ensure their systems are kept up to date. ”
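The quoted description does not say how the matching tool decides that two records belong to the same person, so the following is only a rough sketch of the general idea under stated assumptions: records are matched on a crude key made from the name and date of birth, and a "change of circumstances" is flagged when linked records disagree on address. The function names and field names (`name`, `dob`, `address`) are hypothetical, and real record-linkage products use far more forgiving matching than this.

```python
from collections import defaultdict

def match_key(record):
    """Crude deterministic key: lower-cased name without spaces, plus date of birth."""
    name = "".join(record["name"].lower().split())
    return (name, record["dob"])

def build_client_index(systems):
    """Link records from every source system that share the same match key.

    systems: dict mapping a system name to a list of record dicts.
    """
    index = defaultdict(list)
    for system_name, records in systems.items():
        for record in records:
            index[match_key(record)].append((system_name, record))
    return dict(index)

def address_conflicts(index):
    """Return the keys whose linked records disagree on address."""
    conflicts = {}
    for key, linked in index.items():
        addresses = {rec["address"] for _, rec in linked}
        if len(addresses) > 1:
            conflicts[key] = sorted(addresses)
    return conflicts
```

The conflicts are the part that matters for accuracy: each one is a prompt for a department to check which address is current, which is roughly the "passed back to council departments" step in the quote. Note that an exact-key match like this will both miss genuine matches (a misspelt name) and create false ones (two people sharing a name and birthday), which is where the information assurance questions start.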

Does anyone else notice that UK local governments have been leading the way for a couple of years?

It’s Not Only Government Working Through Privacy Issues… (Google Version)

Posted by Tom Fuller in Blindside project, Data breaches, data mining at September 17th, 2007

Via Crooked Timber: “Google is staking a claim on the moral high ground of Internet privacy. The company has called for new international rules, ostensibly to protect privacy online. Little of Google’s search information is strictly ‘personal data’, i.e. data directly concerning named individuals. But search data, potentially tied to individuals’ IP numbers, is dynamite, something it’s taken Google a long time to face up to publicly. ”

As really serious bloggers are wont to say, Read The Whole Thing.

Money Quote: “It can’t be controversial to infer from all this that in the current climate, any changes to data protection will focus more on accommodating business and law enforcement concerns than privacy ones. Opening up data protection negotiations anywhere – in the EU, at the OECD or at some UN forum to be imagined – can only have the effect of weakening existing protections.”

IA in a Mobile Age

We have tended here to concentrate on protecting information flows through computer networks. This is in part because there is so much work still to be done in this area, but I think also in part because most Blindsiders are of a computer-centric generation (you may well say ‘speak for yourself, Fuller’, and I’ll eat humble pie).

However, mobile computing is growing faster than just about anything that gets measured in tech terms (well, except for Larry Ellison’s ego…) and I am personally convinced that a combination of mobile computing, location-based services and pervasive computing is going to explode onto the scene, offering new possibilities and new threats. I not only believe this–the success of my private pension scheme depends on it.

I think the day is coming very fast when the fact that I sit in a room at a desktop will instantly identify me as a grumpy old man (I think women will adapt to the new paradigm without much fuss). I think mobile devices with Japanese butterfly fan screens that fold up will move computing outside the converted second bedroom and into the street, and flash memory lapel pins will hold more information than my laptop.

It’s all going to be great fun, and I’m looking forward to it. But one reason I think it’s going to be fun is the fact that I’m not charged with assuring information flows within a government organisation. I think the number of nodes in organisational networks is set to grow exponentially and that the edges of networks are going to blur dramatically.

I think IA specialists in 10 years are going to reminisce fondly about how life was so simple in 2007, before they had to build concentric circles of protection and build data hierarchies that have to exist in different forms within each circle.

Those of us who have retirement in mind before 2017 may breathe a sigh of relief that it won’t happen on our watch (although it still may). And it might be fair to say that a fairly large share of Blindsiders fall within this group. But I think we owe it to the next generation of information assurance professionals to set the stage for them.

When memory becomes so small and cheap that your life fits into your belt buckle, when people will normally carry four or five objects on their person that have network connectivity, when hundreds of services offer local data based on segmentation rather than aggregation, when p2p dating services sit next to real-time data flows from your banking and investment activity, when government networks imperceptibly bleed into and through a myriad of specialist networks, information assurance will take on a different meaning.

We are entering that period of time where the evolutionary explosion fills an environmental niche created by a new technology. The prelude is finished. It’s just a bit funny that it’s not just one new technology–that computer science, biology, nanotechnology and whatever else I’m forgetting are coming of age at the same time.

Who needs science fiction?

Should Everyone Be On The DNA Database?

The reaction from (I think) almost everyone who contributes to the Blindside project would be no. However, after hearing our impassioned arguments, many in Government still believe it is in the UK’s best interests to order everyone in the UK to submit DNA to government for inclusion in a national database.

Instead of starting off with my reasons why I think this is a seriously flawed idea, I want to focus on the reasons why some think it is good–or at least necessary. I don’t believe that all who support a comprehensive DNA database are either evil or fools, and some clearly have given thought to this.

A national registry of DNA would help government perform some things more efficiently without requiring structural change. Currently, the national media keeps attention focused on certain major issues–crime, and to a lesser extent (this year at least), immigration. Government supporters of a DNA database evidently believe that it would help deal with those issues.

My argument (FWIW) against this is that a DNA database would help in solving crime and identifying current illegal immigrants, but would do much less in preventing crime and future illegal immigration. Similar arguments were advanced regarding CCTV’s potential for deterrence of crime, and these arguments proved invalid. CCTV has not deterred crime, but has helped identify criminals after the fact. I don’t think a DNA DB would play out much differently. Hence, to me it seems a major sacrifice of personal liberty for a false hope. And if a DNA database proves ineffective in dealing with crime and immigration, government will not throw away the DB in disgust.

But the current structure of police forces, with fewer cops on the beat actually deterring crime, has shifted its focus to high tech resolution of crime instead. A DNA database would allow them to keep the same structure, beefing it up and increasing their powers. A DNA DB would allow the judicial system, currently fighting a backlog at the same time it resists internal technological change, to be (it hopes) more efficient without, again, undergoing structural change.

The persistence of the desire for such a database in the face of all the problems that have been noted in the concept means to me that government feels besieged, not just by crime and immigration (which aren’t nearly as bad as the effects of media coverage of same), but by all the effects of the 20th and 21st centuries, and is searching for a silver bullet that will allow it to do things the way it wants to do them.

There has been considerable reorganisation of government departments over the past 5 years, but it’s hard to avoid the impression that much of that has been name changing and seat shuffling. I think the most passionate advocates of a DNA database are really defending their way of life more than anything else.

I do think every discussion of a national DNA registry should include a brief summary of some of the most important objections to it:

1. Data will be entered incorrectly, lost or sold illegally. As the system gets used for more purposes, the effects will be fatal to some. Lives will be lost.

2. People will learn how to defeat the system, reducing its reliability. The most common means will be via corruption of civil servants.

3. The money spent on such a system, if redirected towards a more visible police presence in city centres on Saturday nights and at the principal points of entry into the UK, would actually reduce crime and illegal immigration to the extent that the DNA registry would not be necessary.

4. As currently constituted, the UK government is incapable of holding this information securely. It will be stolen. It will be sold.

5. Maintaining border security by identifying ‘legitimate’ citizens and assuming anyone not on the list is illegitimate will result in wide-scale violations of human rights and crimes against those who do not appear on the list.

I almost got through that list without mentioning human rights, and I didn’t talk about liberty either. They evidently are not a major consideration in this argument, so why beat a dead horse?

Let me just mention what I would support. A database for the NHS with voluntary contributions of DNA to assist in patient care. Mandatory DNA sampling of criminals convicted of a serious crime. That’s it.

And by the way, it should be obvious that arguments against a national DNA registry transfer without much modification to a National Identity Card Programme. As with a DNA registry, it is being proposed to benefit government, and the burden of proof needs to be placed squarely on the shoulders of its proponents.