DataBlog

DataBlog Blogroll

  • Salesforce
    We deliver affordable hosted CRM solutions for enterprise, mid-market, and small business companies.
  • CA
    We're in the business of IT management software. Indeed, we have a clear vision of how organizations can manage complex IT environments across the enterprise to realize the full power of technology to drive business.
  • IBM Deep Computing
    Delivers innovative powerful open solutions to help you affordably address your demands for intense computation, visualization, or manipulation and management of massive amounts of data.
  • EMC
    Products, services, and solutions for information management and storage.
  • Fair Isaac
    Predictive analytics, decision management and credit management solutions. We offer unique data intelligence solutions to help businesses improve decision management through predictive analysis.
  • Business Objects
    Provider of enterprise business intelligence solutions, enabling organizations to track, understand, and manage enterprise performance.
  • Insightful
    Statistics and data mining software by Insightful: Powerful, usable statistics and data mining solutions.
  • Keane
    Keane is a business and IT services firm that delivers Application and Business Process Services.
  • Cognos
    Business intelligence & corporate performance management software for better business decisions.

Data Blog, a data intelligence and analysis information blog.
Providing a daily dose of news and features from the world of the predictive analytics, decision management, and data mining industry.

Life Expectancy: World Health Statistics Data

A boy born in San Marino, a tiny republic surrounded by Italy, will likely live to age 80, the world's longest male life expectancy, but newborn girls in Japan and 30 other countries have even better prospects. Females in Japan, who traditionally lead the world tables, have a life expectancy of 86 years, the same as last year's statistics. San Marino men, who tied with Japanese men last year at 79, added a year to get ahead.

Following San Marino on the male side were Australia, Iceland, Japan, Sweden and Switzerland at 79 years and then Canada, Israel, Italy, Monaco and Singapore at 78. France was tied for 12th place at 77 years with a group of countries including New Zealand and Britain. Germany was at 76 years. The United States was among the countries tied for 33rd place at 75 years.

Countries with long-living women include Monaco, 85 years, and Andorra, Australia, France, Italy, San Marino, Spain and Switzerland at 84. Canada tied Iceland and Sweden at 83 years for women, and Germany was in a group at 82 years. Britain came in at 81 years. Costa Rica and Denmark tied the United States for 32nd place at 80 years.

» Database / Downloads

[ PDF ] View the ten global health highlights
[ PDF ] Review the indicator definitions and metadata

Google: I want to organise your daily life

Google’s ambition to maximise the personal information it holds on users is so great that the search engine envisages a day when it can tell people what jobs to take and how they might spend their days off. Eric Schmidt, Google’s chief executive, said gathering more personal data was a key way for Google to expand and the company believes that is the logical extension of its stated mission to organise the world’s information.

Asked how Google might look in five years’ time, Mr Schmidt said: “We are very early in the total information we have within Google. The algorithms will get better and we will get better at personalisation. The goal is to enable Google users to be able to ask the question such as ‘What shall I do tomorrow?’ and ‘What job shall I take?’”

» ft.com

LifeLog: Professor puts his whole life online

There are already tons of pictures at Tracking Transience. Elahi will post about a hundred today — the rooms he sat in, the food he ate, the coffees he ordered. Poke around his site and you'll find more than 20,000 images stretching back three years. Elahi has documented nearly every waking hour of his life during that time. He posts copies of every debit card transaction, so you can see what he bought, where, and when. A GPS device in his pocket reports his real-time physical location on a map.

"I've discovered that the best way to protect your privacy is to give it away"

» wired.com

InfoBunker: nuclear hardened data center

InfoBunker is a Cold War era government command bunker converted into a data center. The Department of Defense built it to survive a "Maximum Probably Event," such as a 20-megaton nuclear explosion. The 65,000 square foot, mostly-underground facility is equipped with a Nuc/Bio/Chem air filtration system. Your data will be intact even if the rest of the Internet has been vaporized.

» infobunker.com

"Data storm" blamed for nuclear-plant shutdown

The U.S. House of Representatives' Committee on Homeland Security called this week for the Nuclear Regulatory Commission (NRC) to further investigate the cause of excessive network traffic that shut down an Alabama nuclear plant.

During the incident, which happened last August at Unit 3 of the Browns Ferry nuclear power plant, operators manually shut down the reactor after two water recirculation pumps failed. The recirculation pumps control the flow of water through the reactor, and thus the power output of boiling-water reactors (BWRs) like Browns Ferry Unit 3. An investigation into the failure found that the controllers for the pumps locked up following a spike in data traffic -- referred to as a "data storm" in the NRC notice -- on the power plant's internal control system network. The deluge of data was apparently caused by a separate malfunctioning control device, known as a programmable logic controller (PLC).

» securityfocus.com

Workshop on Data Mining in Web 2.0 Environments

Users are strongly attracted to the emerging Web 2.0 environments, which let them provide content in a simple, unrestricted, and ad hoc way. Providing annotations (such as tags) in a Web 2.0 manner is applicable to a wide range of resources and data types, such as web pages, images, and multimedia. There is, however, a disadvantage: the freedom to provide arbitrary (personal) content and tags in ubiquitous, uncoordinated ways results in very large amounts of poorly structured information. Beyond the current hype around Web 2.0 applications, this raises several important challenges for future data and web mining methods.

The workshop aims to bring together researchers and professionals in the areas of data and web mining, information systems and collaborative systems to discuss challenges and solutions of applying data mining to highly unstructured, user created data. Such challenges include the analysis of loosely-coupled snippets of information, such as overlapping tag structures, homonym or synonym tags, blog networks etc. Other challenges arise from scalability issues or new forms of fraud and spam. They demand, for instance, innovative methods of tag clustering, filtering, aggregation, personalization and visualization. Topics of interest include but are not limited to:

  • analysis of blogs
  • tag clustering and visualization
  • synonym and homonym resolution in tags
  • visual and textual information extraction
  • temporal analysis
  • data streams, trend detection, and concept drift
  • application of web and text mining to wiki content
  • discovering social structures and communities
  • evolution of online social networks
  • predicting user behavior
  • analysis of dynamic networks
  • discovering misuse and fraud
  • combining the web with data from other sources, mining with mashups
  • deriving profiles from usage
  • personalized delivery of information
  • applications, case studies

The International Workshop on Data Mining in Web 2.0 Environments is held in conjunction with the IEEE International Conference on Data Mining (ICDM 2007) on October 28, 2007, in Omaha, United States.

[ PDF ] call for papers » uni-kassel.de

PBS: Spying on the Home Front / Domestic surveillance datalogging

Last night, PBS Frontline aired Spying on the Home Front, devoted to all the ways the US government is spying on us.

9/11 has indelibly altered America in ways that people are now starting to earnestly question: not only perpetual orange alerts, barricades and body frisks at the airport, but greater government scrutiny of people's records and electronic surveillance of their communications. The watershed, officials tell FRONTLINE, was the government's shift after 9/11 to a strategy of pre-emption at home -- not just prosecuting terrorists for breaking the law, but trying to find and stop them before they strike.

President Bush described his anti-terrorist measures as narrow and targeted, but a FRONTLINE investigation has found that the National Security Agency (NSA) has engaged in wiretapping and sifting Internet communications of millions of Americans; the FBI conducted a data sweep on 250,000 Las Vegas vacationers, and along with more than 50 other agencies, they are mining commercial-sector data banks to an unprecedented degree.

Even government officials with experience since 9/11 are nagged by anxiety about the jeopardy that a war without end against unseen terrorists poses to our way of life, our personal freedoms. "I always said, when I was in my position running counterterrorism operations for the FBI, 'How much security do you want, and how many rights do you want to give up?'" Larry Mefford, former assistant FBI director, tells Smith. "I can give you more security, but I've got to take away some rights. … Personally, I want to live in a country where you have a common-sense, fair balance, because I'm worried about people that are untrained, unsupervised, doing things with good intentions but, at the end of the day, harm our liberties."

Although the president told the nation that his NSA eavesdropping program was limited to known Al Qaeda agents or supporters abroad making calls into the U.S., comments of other administration officials and intelligence veterans indicate that the NSA cast its net far more widely. AT&T technician Mark Klein inadvertently discovered that the whole flow of Internet traffic in several AT&T operations centers was being regularly diverted to the NSA, a charge indirectly substantiated by John Yoo, the Justice Department lawyer who wrote the official legal memos legitimizing the president's warrantless wiretapping program. Yoo told FRONTLINE: "The government needs to have access to international communications so that it can try to find communications that are coming into the country where Al Qaeda's trying to send messages to cell members in the country. In order to do that, it does have to have access to communication networks."

Spying on the Home Front also looks at a massive FBI data sweep in December 2003. On a tip that Al Qaeda "might have an interest in Las Vegas" around New Year's 2004, the FBI demanded records from all hotels, airlines, rental car agencies, casinos and other businesses on every person who visited Las Vegas in the run-up to the holiday. Stephen Sprouse and Kristin Douglas of Kansas City, Mo., object to being caught in the FBI dragnet in Las Vegas just because they happened to get married there at the wrong moment. Says Douglas, "I'm sure that the government does a lot of things that I don't know about, and I've always been OK with that -- until I found out that I was included."

A check of all 250,000 Las Vegas visitors against terrorist watch lists turned up no known terrorist suspects or associates of suspects. The FBI told FRONTLINE that the records had been kept for more than two years, but have now all been destroyed.

In the broad reach of NSA eavesdropping, the massive FBI data sweep in Las Vegas, access to records gathered by private database companies that allows government agencies to avoid the limitations provided by the Privacy Act, and nearly 200 other government data-mining programs identified by the Government Accountability Office, experienced national security officials and government attorneys see a troubling and potentially dangerous collision between the strategy of pre-emption and the Fourth Amendment's protections against unreasonable search and seizure.

Peter Swire, a law professor and former White House privacy adviser to President Clinton, tells FRONTLINE that since 9/11 the government has been moving away from the traditional legal standard of investigations based on individual suspicion to generalized suspicion. The new standard, Swire says, is: "Check everybody. Everybody is a suspect."

» pbs.org

IBM Many Eyes: Collective intelligence for insight and analysis

Many Eyes is a bet on the power of human visual intelligence to find patterns. Our goal is to "democratize" visualization and to enable a new social kind of data analysis. Jump right to our visualizations now, take a tour, or read on for a leisurely explanation of the project.

As visualization designers we have witnessed and experienced many of those wondrous sparks. But in recent years, we have become acutely aware that the visualizations, and the sparks they generate, take on new value in a social setting. Visualization is a catalyst for discussion and collective insight about data. We all deal with data that we'd like to understand better.

» alphaworks.ibm.com

Today is America's wiretap the Internet day

May 14th is the official deadline for cable modem companies, DSL providers, broadband over powerline, satellite internet companies and some universities to finish wiring up their networks with FBI-friendly surveillance gear, to comply with the FCC's expanded interpretation of the Communications Assistance for Law Enforcement Act.

Congress passed CALEA in 1994 to help FBI eavesdroppers deal with digital telecom technology. The law required phone companies to make their networks easier to wiretap. The results: on mobile phone networks, where CALEA tech has 100% penetration, it's credited with boosting the number of court-approved wiretaps a carrier can handle simultaneously, and greatly shortening the time it takes to get a wiretap going. Cops can now start listening in less than a day.

» wired.com

eBay Launches Feedback 2.0 System

eBay's (EBAY) latest townhall meeting featured Brian Burke, the manager and visionary behind eBay's recently launched Feedback 2.0 system. Burke explained how the new system improves the buying experience and increases conversion rates (at least in tests).

Feedback 2.0 enhances eBay's current rating system by adding more detail (the current system is nearly worthless because so many sellers have 98%+ positive feedback). Buyers can now rate sellers on four criteria: item description, communications, shipping time, and shipping/handling charges. Each category gets a 1-to-5 rating, and the average score is displayed on the seller's Feedback page. The goal is to drive traffic from bad sellers to good sellers and help all sellers improve.
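As a rough illustration of the per-category averaging described above, here is a minimal Python sketch. The category keys and the sample ratings are hypothetical, and this is not eBay's actual implementation; it only shows one way four 1-to-5 ratings could be averaged per seller.

```python
from statistics import mean

# Hypothetical keys paraphrasing the four criteria named in the post.
CATEGORIES = ("item_description", "communication", "shipping_time", "shipping_charges")

def average_ratings(ratings):
    """Return the mean 1-to-5 score for each category across all ratings."""
    for rating in ratings:
        for category in CATEGORIES:
            if not 1 <= rating[category] <= 5:
                raise ValueError(f"{category} must be between 1 and 5")
    return {c: mean(r[c] for r in ratings) for c in CATEGORIES}

# Made-up ratings left by two buyers for one seller.
sample = [
    {"item_description": 5, "communication": 4, "shipping_time": 3, "shipping_charges": 4},
    {"item_description": 4, "communication": 4, "shipping_time": 5, "shipping_charges": 2},
]
averages = average_ratings(sample)
for category, score in averages.items():
    print(f"{category}: {score}")
```

Keeping one average per category, instead of a single overall score, preserves the extra detail the new system is meant to add.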

[ PDF ] Transcript [ mp3 ] listen or download

How do numbers begin? (The first digit law)

Does your house address start with a 1? According to a strange mathematical law, about 1/3 of house numbers have 1 as their first digit. The same holds true for many other areas that have almost nothing in common: the Dow Jones index history, size of files stored on a PC, the length of the world’s rivers, the numbers in newspapers’ front page headlines, and many more.

The law is called Benford’s law after its (second) discoverer, Frank Benford, who found it in 1935 as a physicist at General Electric. The law tells how often each digit (from 1 to 9) appears as the first significant digit in a very diverse range of data sets.

Besides the digit 1 consistently appearing about 1/3 of the time (30.1%), the digit 2 appears with a frequency of 17.6%, the digit 3 at 12.5%, on down to the digit 9 at 4.6%. In mathematical terms, this logarithmic law is written as F(d) = log10[1 + (1/d)], where F is the frequency and d is the digit in question.
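The formula is easy to check numerically. The sketch below assumes the logarithm is base 10 and compares the predicted frequencies with the leading digits of the first 1,000 powers of two, a data set chosen here purely for illustration (it is not mentioned in the article).

```python
import math
from collections import Counter

def benford_frequency(d):
    """Expected share of numbers whose first significant digit is d (1 to 9)."""
    if d not in range(1, 10):
        raise ValueError("d must be an integer from 1 to 9")
    return math.log10(1 + 1 / d)

def leading_digit_shares(numbers):
    """Observed share of each leading digit among positive integers."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Illustrative data set: 2**1 through 2**1000.
powers_of_two = [2 ** n for n in range(1, 1001)]
observed = leading_digit_shares(powers_of_two)

for d in range(1, 10):
    print(d, f"expected {benford_frequency(d):.1%}", f"observed {observed[d]:.1%}")
```

The nine expected frequencies sum to 1, because the product of (1 + 1/d) for d = 1..9 is exactly 10.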

» physorg.com / » iop.org for PDF

The Art of Forgetting in the Age of Ubiquitous Computing

Mayer-Schönberger lays out his idea in a faculty research working paper called "Useful Void: The Art of Forgetting in the Age of Ubiquitous Computing," where he describes his plan as reinstating "the default of forgetting our societies have experienced for millennia."

Why would we want our machines to "forget"? Mayer-Schönberger suggests that we are creating a Benthamist panopticon by archiving so many bits of knowledge for so long. The accumulated weight of stored Google searches, thousands of family photographs, millions of books, credit bureau information, air travel reservations, massive government databases, archived e-mail, etc., can actually be a detriment to speech and action, he argues.

"If whatever we do can be held against us years later, if all our impulsive comments are preserved, they can easily be combined into a composite picture of ourselves," he writes in the paper. "Afraid how our words and actions may be perceived years later and taken out of context, the lack of forgetting may prompt us to speak less freely and openly."

[ PDF ] Useful Void: The Art of Forgetting in the Age of Ubiquitous Computing

Semantic web: high-speed RDF search engine developed

Irish researchers have developed a new high-speed RDF search engine capable of answering search queries over more than seven billion RDF statements in mere fractions of a second.

"The importance of this breakthrough cannot be overestimated," said Professor Stefan Decker, director of DERI. "These results enable us to create web search engines that really deliver answers instead of links. The technology also allows us to combine information from the web; for example, the engine can list all partnerships of a company even if there is no single web page that lists all of them."

» the register

Tim Berners-Lee on the Semantic Web

The inventor of the World Wide Web explains how the Semantic Web works and how it will transform how we use and understand data.

[ Video ] view presentation

Map: Blogosphere as social network

Discover Magazine has an interesting article on mapping the blogosphere, reporting on the work of Matthew Hurst. Hurst put together a 3D map of the blogosphere, in which bright spots represent the sites with the highest number of links and isolated islands represent closed communities like LiveJournal. The study also identifies other islands like sociopolitical commentary, gadget hounds, sports fans, and, um, porn blogs.

The blogosphere is the most explosive social network you’ll never see. Recent studies suggest that nearly 60 million blogs exist online, and about 175,000 more crop up daily (that’s about 2 every second). Even though the vast majority of blogs are either abandoned or isolated, many bloggers like to link to other Web sites. These links allow analysts to track trends in blogs and identify the most popular topics of data exchange. Social media expert Matthew Hurst recently collected link data for six weeks and produced this plot of the most active and interconnected parts of the blogosphere.

» discovermagazine
» map graphic

CitySense, an urban scale sensor network

CitySense is an urban scale sensor network testbed that is being developed by researchers at Harvard University and BBN Technologies. CitySense will consist of 100 wireless sensors deployed on light poles around the city of Cambridge, MA. Each node will consist of an embedded PC, 802.11a/b/g interface, and various sensors for monitoring weather conditions and air pollutants. Most importantly, CitySense is intended to be an open testbed that researchers from all over the world can use to evaluate wireless networking and sensor network applications in a large-scale urban setting.

» citysense.net » wireless sensor networks

Grid Computes 420 Years Worth of Data in 4 Months

By running the problem across 5,000 computers for a total of four months, the WISDOM project analyzed some 80,000 drug compounds every hour. The search for new drug compounds is normally a time-intensive process, but the grid approach did the work of 420 years of computation in just 16 weeks. Individuals in over 25 countries participated.

Up to 5,000 computers were used at any one time, generating a total of 2,000GB of useful data. More than 140 million compounds were processed by the end of the four months, and results are expected to speed up and reduce the costs involved in searching for an anti-malaria drug.
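The headline figures can be sanity-checked with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not anything published by the WISDOM project. It assumes an average year of 365.25 days and treats "more than 140 million" compounds as a lower bound.

```python
HOURS_PER_WEEK = 7 * 24
WEEKS_PER_YEAR = 365.25 / 7  # assumption: average year length

cpu_years = 420          # total computation reported in the article
wall_clock_weeks = 16    # elapsed time reported in the article
total_compounds = 140e6  # "more than 140 million", used as a lower bound

def effective_parallelism(cpu_years, wall_clock_weeks):
    """Average number of machines that must have been busy at once."""
    return cpu_years * WEEKS_PER_YEAR / wall_clock_weeks

def average_rate_per_hour(total, wall_clock_weeks):
    """Average items processed per wall-clock hour."""
    return total / (wall_clock_weeks * HOURS_PER_WEEK)

machines = effective_parallelism(cpu_years, wall_clock_weeks)
rate = average_rate_per_hour(total_compounds, wall_clock_weeks)
print(f"average machines busy: {machines:.0f}")
print(f"average compounds per hour: {rate:.0f}")
```

Under these assumptions the figures imply roughly 1,370 machines busy on average, well under the 5,000 peak, and an average of about 52,000 compounds per hour. That suggests the 80,000-per-hour figure may describe a peak or busy-period rate; the article does not say.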

"Drug development is a very long process, typically [lasting] 12 years and [costing] US$800 million," said Vincent Breton, a research associate at the French National Centre for Scientific Research who worked on the project. "What WISDOM shows is that the first step of this process which is drug discovery can be significantly accelerated and made cheaper using grids. This is particularly relevant to neglected diseases which suffer a lack of R&D [Research and Development]."

» Computerworld

MyLifeBits: The era of digital memories

New systems may allow people to record everything they see and hear--and even things they cannot sense--and to store all these data in a personal digital archive.

Human memory can be maddeningly elusive. We stumble upon its limitations every day, when we forget a friend's telephone number, the name of a business contact or the title of a favorite book. People have developed a variety of strategies for combating forgetfulness--messages scribbled on Post-it notes, for example, or electronic address books carried in handheld devices--but important information continues to slip through the cracks. Recently, however, our team at Microsoft Research has begun a quest to digitally chronicle every aspect of a person's life, starting with one of our own lives (Bell's). For the past six years, we have attempted to record all of Bell's communications with other people and machines, as well as the images he sees, the sounds he hears and the Web sites he visits--storing everything in a personal digital archive that is both searchable and secure.

Digital memories can do more than simply assist the recollection of past events, conversations and projects. Portable sensors can take readings of things that are not even perceived by humans, such as oxygen levels in the blood or the amount of carbon dioxide in the air. Computers can then scan these data to identify patterns: for instance, they might determine which environmental conditions worsen a child's asthma. Sensors can also log the three billion or so heartbeats in a person's lifetime, along with other physiological indicators, and warn of a possible heart attack. This information would allow doctors to spot irregularities early, providing warnings before an illness becomes serious. Your physician would have access to a detailed, ongoing health record, and you would no longer have to rack your brain to answer questions such as "When did you first feel this way?"

» sciam.com

Surging internet services buoy Cisco

A surge in demand for bandwidth-hogging internet video services gave Cisco Systems a shot in the arm on Wednesday as the world's biggest maker of data networking equipment reported a 27 per cent jump in quarterly profit.

John Chambers, chief executive, hailed the results. He said Cisco was "in the midst of a market inflection that is changing the landscape of networking" as the internet emerges as the preferred platform for the delivery of voice, video and data services.

» Financial Times

Metacrap: Putting the torch to seven straw-men of the meta-utopia

Metadata is "data about data" -- information like keywords, page-length, title, word-count, abstract, location, SKU, ISBN, and so on. Explicit, human-generated metadata has enjoyed recent trendiness, especially in the world of XML. A typical scenario goes like this: a number of suppliers get together and agree on a metadata standard -- a Document Type Definition or scheme -- for a given subject area, say washing machines. They agree to a common vocabulary for describing washing machines: size, capacity, energy consumption, water consumption, price. They create machine-readable databases of their inventory, which are available in whole or part to search agents and other databases, so that a consumer can enter the parameters of the washing machine he's seeking and query multiple sites simultaneously for an exhaustive list of the available washing machines that meet his criteria.
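To make the scenario concrete, here is a minimal Python sketch of a shared vocabulary and a query across several suppliers' inventories. The field names, units, supplier names and figures are all hypothetical. It shows only the idealized case where every supplier fills in the fields honestly and consistently, which is the assumption the essay goes on to attack.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WashingMachine:
    """One record in the agreed vocabulary (units are assumptions)."""
    supplier: str
    capacity_kg: float
    energy_kwh_per_cycle: float
    water_l_per_cycle: float
    price_usd: float

def search(inventories, max_price_usd, min_capacity_kg):
    """Query every supplier's inventory at once and return matches by price."""
    matches = [
        machine
        for inventory in inventories
        for machine in inventory
        if machine.price_usd <= max_price_usd and machine.capacity_kg >= min_capacity_kg
    ]
    return sorted(matches, key=lambda m: m.price_usd)

# Made-up inventories from two hypothetical suppliers.
inventories = [
    [WashingMachine("SupplierA", 7.0, 0.9, 50.0, 499.0),
     WashingMachine("SupplierA", 9.0, 1.1, 60.0, 799.0)],
    [WashingMachine("SupplierB", 8.0, 1.0, 55.0, 599.0)],
]
matches = search(inventories, max_price_usd=650.0, min_capacity_kg=7.5)
for machine in matches:
    print(machine)
```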

If everyone would subscribe to such a system and create good metadata for the purposes of describing their goods, services and information, it would be a trivial matter to search the Internet for highly qualified, context-sensitive results: a fan could find all the downloadable music in a given genre, a manufacturer could efficiently discover suppliers, travelers could easily choose a hotel room for an upcoming trip.

A world of exhaustive, reliable metadata would be a utopia. It's also a pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated market opportunities.

There are at least seven insurmountable obstacles between the world as we know it and meta-utopia. I'll enumerate them below:

  1. People lie
  2. People are lazy
  3. People are stupid
  4. Mission: Impossible -- know thyself
  5. Schemas aren't neutral
  6. Metrics influence results
  7. There's more than one way to describe something

» Cory Doctorow

Data Blog Modeling

In information system design, data modeling is the analysis and design of the information in the system, concentrating on the logical entities and the logical dependencies between these entities. Data modeling is an abstraction activity in that the details of the values of individual data observations are ignored in favor of the structure, relationships, names and formats of the data of interest, although a list of valid values is frequently recorded. It is through the data model that definitions of what the data means are related to the data structures.

While a common term for this activity is "data analysis," the activity actually has more in common with the ideas and methods of synthesis (putting things together) than with the original meaning of the term analysis (taking things apart). This is because the activity strives to bring the data structures of interest together into a cohesive, inseparable whole by eliminating unnecessary data redundancies and relating data structures through relationships.

In the early phases of a software development project, the emphasis is on the design of a conceptual data model. This can be refined into a logical data model, sometimes called a functional data model. In later stages, this model may be translated into a physical data model.

Several techniques have been developed for the design of data models. The most notable are:

• RM/T
• Bachman diagrams
• Entity-relationship diagrams
• Object Role Modeling (ORM) or NIAM
• Object-relationship modeling
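As a small illustration of the step from a logical to a physical data model, the Python sketch below defines two entities and one dependency and then translates them into SQLite tables. The Customer/Order domain, the table names and the list of valid status values are assumptions made for the example. It does not follow any particular technique from the list above.

```python
import sqlite3
from dataclasses import dataclass

# Logical model: entities and the dependency between them (hypothetical domain).
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str

@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int  # each Order depends on exactly one Customer
    status: str       # valid values: "open", "shipped", "cancelled"

# Physical model: one possible translation into SQLite tables.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    status      TEXT NOT NULL CHECK (status IN ('open', 'shipped', 'cancelled'))
);
"""

def build_database():
    """Create an in-memory database with foreign-key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn

conn = build_database()
conn.execute("INSERT INTO customer VALUES (?, ?)", (1, "Ada"))
conn.execute("INSERT INTO customer_order VALUES (?, ?, ?)", (10, 1, "open"))
try:
    # An order for a customer that does not exist violates the dependency.
    conn.execute("INSERT INTO customer_order VALUES (?, ?, ?)", (11, 99, "open"))
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

The list of valid status values is recorded in the model, while the individual rows are not, which mirrors the abstraction described above.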

via [ DataBlog ]

Data Mining + Data Dredging

Data mining, also known as knowledge-discovery in databases (KDD), is the practice of automatically searching large stores of data for patterns. To do this, data mining uses computational techniques from statistics and pattern recognition.

A simple example of data mining is its use in a retail sales department. If a store tracks the purchases of a customer and notices that a customer buys a lot of silk shirts, the data mining system will make a correlation between that customer and silk shirts. The sales department will look at that information and may begin direct mail marketing of silk shirts to that customer, or it may alternatively attempt to get the customer to buy a wider range of products. In this case, the data mining system used by the retail store discovered new information about the customer that was previously unknown to the company.
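The silk-shirt example can be sketched in a few lines of Python. This is a deliberately simple counting approach, not a real data mining system. The thresholds, customer IDs and purchase records are all made up for illustration.

```python
from collections import Counter, defaultdict

def frequent_categories(purchases, min_count=3, min_share=0.5):
    """Map each customer to the product categories that dominate their purchases.

    purchases is an iterable of (customer_id, category) pairs. A category is
    reported when the customer bought it at least min_count times and it makes
    up at least min_share of that customer's purchases (assumed thresholds).
    """
    per_customer = defaultdict(Counter)
    for customer, category in purchases:
        per_customer[customer][category] += 1

    findings = {}
    for customer, counts in per_customer.items():
        total = sum(counts.values())
        hits = [cat for cat, n in counts.items()
                if n >= min_count and n / total >= min_share]
        if hits:
            findings[customer] = sorted(hits)
    return findings

# Made-up purchase log.
purchases = [
    ("c1", "silk shirt"), ("c1", "silk shirt"), ("c1", "silk shirt"), ("c1", "socks"),
    ("c2", "socks"), ("c2", "tie"),
]
print(frequent_categories(purchases))
```

What the sales department does with the finding (more silk-shirt mailings, or an attempt to broaden the customer's purchases) is a business decision outside the code.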

Data Dredging

Used in the technical context of data warehousing and analysis, the term "data mining" is neutral. However, it sometimes has a more pejorative usage that implies imposing patterns (and particularly causal relationships) on data where none exist. This imposition of irrelevant, misleading or trivial attribute correlation is more properly criticized as "data dredging" in the statistical literature.

Used in this latter sense, data dredging implies scanning the data for any relationships and then, when one is found, coming up with an interesting explanation. (This is closely related to "overfitting the model".) The problem is that large data sets invariably happen to have some exciting relationships peculiar to that data. Therefore any conclusions reached are likely to be highly suspect. In spite of this, some exploratory data work is always required in any applied statistical analysis to get a feel for the data, so sometimes the line between good statistical practice and data dredging is less than clear.
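The danger is easy to demonstrate with pure noise. The Python sketch below generates a random target and many random, unrelated features, then reports the strongest correlation it finds. The sample size, number of features and seed are arbitrary choices for the illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

def dredge(n_rows=20, n_features=200, seed=0):
    """Return the largest |r| between a noise target and many noise features."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(target, feature)))
    return best

print(f"strongest 'relationship' found in pure noise: |r| = {dredge():.2f}")
```

With only 20 rows and 200 candidate features, some feature will almost always correlate noticeably with the target even though none is related to it. Testing a hypothesis on fresh data, rather than on the data that suggested it, is the usual guard against this.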

via [ Data Blog ]

Data Blog Statistical Methods

The basic goal of a statistical research project is to draw a conclusion about the effect of changes in an independent variable on a dependent variable. There are two major types of statistical studies: experimental studies and post facto (after-the-fact) studies. In both types, the effect of changes in an independent variable on the behavior of the dependent variable is observed. The difference between the two lies in how the study is actually conducted.

An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation may have modified the values of the measurements. A post-facto study involves reviewing existing data and making a determination about a correlation between two measurements.

There are four types of measurements, or measurement scales, used in statistics: nominal, ordinal, interval, and ratio. The four levels of measurement have different degrees of usefulness in statistical research. Typically, the most useful level is ratio measurement, since it provides the greatest flexibility in the statistical methods that can be used for analysing the data. Interval measurement, such as IQ measurements or temperature measurements in degrees Celsius, is also used in statistical research.

The basic steps of any statistical research project are to:

1. plan the research including determining information sources, research subject selection, and ethical considerations for the proposed research and method,

2. design the experiment concentrating on the system model and the interaction of independent and dependent variables,

3. summarize a collection of observations to feature their commonality by suppressing details (descriptive statistics),

4. reach consensus about what the observations tell us about the world we observe (statistical inference),

5. document the results of the study.
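Steps 3 and 4 can be illustrated with a minimal Python sketch of an experimental study, in which the same subjects are measured before and after a manipulation. The measurements are invented, and the paired t-test is only one of many possible inference methods; the text does not prescribe it.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical measurements of the same five subjects before and after a manipulation.
before = [10, 12, 11, 13, 12]
after = [12, 13, 13, 15, 13]

def paired_t_statistic(before, after):
    """t statistic for the mean of the paired differences (after - before)."""
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error

# Step 3: descriptive statistics summarize the observations.
print("mean before:", mean(before), "mean after:", mean(after))

# Step 4: statistical inference. 2.776 is the two-sided 5% critical value of
# Student's t distribution with 4 degrees of freedom (5 pairs minus 1).
T_CRITICAL_DF4 = 2.776
t = paired_t_statistic(before, after)
print(f"t = {t:.2f}; significant at the 5% level: {abs(t) > T_CRITICAL_DF4}")
```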

Probability

The probability of an event is defined as a number between zero and one. In practice, however, there is virtually nothing that has a probability of exactly 1 or 0. You could say that the sun will certainly rise in the morning, but what if an extremely unlikely event destroys the sun? What if there is a nuclear war and the sky is covered in ash and smoke?

We often round the probability of such things up or down because they are so likely or unlikely to occur that it is easier to treat them as having a probability of one or zero.

via [ DataBlog ]

Keyword Tags

DataBlog (tm) dot com / Blogging In Numbers

• Data Management
• Data Mining
• Data Modeling
• Database
• Metadata
• Biostatistics
• Business Statistics
• Economic Statistics

• Artificial Intelligence
• Artificial Neural Network
• Business Intelligence
• Decision Tree
• Fuzzy Logic
• Hypothesis Testing
• Nearest Neighbor (pattern recognition)
• Regression Analysis
• Relational Data Mining