The experts below are selected from a list of 5985 experts worldwide ranked by the ideXlab platform
Christos Faloutsos - One of the best experts on this subject based on the ideXlab platform.
-
APWeb/WAIM - Data mining using fractals and power laws
2007. Co-Authors: Christos Faloutsos. Abstract: What patterns can we find in bursty web traffic? On the web or on the internet graph itself? How about the distributions of galaxies in the sky, or the distribution of a company's customers in geographical space? How long should we expect a nearest-neighbor search to take when there are 100 attributes per patient or customer record? The traditional assumptions (uniformity, independence, Poisson arrivals, Gaussian distributions) often fail miserably. Should we give up trying to find patterns in such settings? Self-similarity, fractals and power laws are extremely successful in describing real datasets (coastlines, river basins, stock prices, brain surfaces, communication-line noise, to name a few). We show some old and new successes, involving modeling of graph topologies (internet, web and social networks); modeling galaxy and video data; dimensionality reduction; and more.
-
SETN - Data mining using fractals and power laws
Advances in Artificial Intelligence, 2006. Co-Authors: Christos Faloutsos. Abstract: What patterns can we find in bursty web traffic? On the web or on the internet graph itself? How about the distributions of galaxies in the sky, or the distribution of a company's customers in geographical space? How long should we expect a nearest-neighbour search to take when there are 100 attributes per patient or customer record? The traditional assumptions (uniformity, independence, Poisson arrivals, Gaussian distributions) often fail miserably. Should we give up trying to find patterns in such settings? Self-similarity, fractals and power laws are extremely successful in describing real datasets (coastlines, river basins, stock prices, brain surfaces, communication-line noise, to name a few). We show some old and new successes, involving modeling of graph topologies (internet, web and social networks); modeling galaxy and video data; dimensionality reduction; and more.
-
ECML - Next generation data mining tools: power laws and self-similarity for graphs, streams and traditional data
Machine Learning: ECML 2003, 2003. Co-Authors: Christos Faloutsos. Abstract: What patterns can we find in bursty web traffic? On the web or internet graph itself? How about the distributions of galaxies in the sky, or the distribution of a company's customers in geographical space? How long should we expect a nearest-neighbor search to take when there are 100 attributes per patient or customer record? The traditional assumptions (uniformity, independence, Poisson arrivals, Gaussian distributions) often fail miserably. Should we give up trying to find patterns in such settings? Self-similarity, fractals and power laws are extremely successful in describing real datasets (coastlines, river basins, stock prices, brain surfaces, communication-line noise, to name a few). We show some old and new successes, involving modeling of graph topologies (internet, web and social networks); modeling galaxy and video data; dimensionality reduction; and more.
Donal Lyons - One of the best experts on this subject based on the ideXlab platform.
-
Tracing Individual Public Transport Customers from an Anonymous Transaction Database
Journal of Public Transportation, 2006. Co-Authors: Gregory S. Tseytin, Markus Hofmann, Margaret O'Mahony, Donal Lyons. Abstract: Data mining concepts are frequently used throughout the transportation research sector. This paper examines the concept of the market basket technique as a means of gaining more insight into public transit users’ demands. The paper proposes a method that uses various data attributes of passenger records to infer the same customer in a different week (i.e., track the same customer from week to week). The general idea behind the measure is that if two records are considered similar, ideally every trip in one customer record should have a close counterpart in the other record. The research develops a similarity function aimed at maximizing the percentage of positive ticket identification over a number of weeks. Once similarity has been established, customer travel patterns can help the operator identify new routes and timetables and make strategic decisions about satisfying public transit customer demands.
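The idea that every trip in one customer record should have a close counterpart in the other can be illustrated with a minimal sketch. This is not the paper's similarity function: the `Trip` fields (`weekday`, `minute`, `route`), the 15-minute tolerance, and the choice to take the weaker of the two directional coverages are all assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    # All three attributes are hypothetical; the abstract only says
    # "various data attributes of passenger records".
    weekday: int   # 0 = Monday ... 6 = Sunday
    minute: int    # boarding time in minutes after midnight
    route: str     # route identifier


def has_counterpart(trip, record, tolerance_min=15):
    """True if `record` holds a trip on the same weekday and route whose
    boarding time is within `tolerance_min` minutes of `trip`."""
    return any(
        other.weekday == trip.weekday
        and other.route == trip.route
        and abs(other.minute - trip.minute) <= tolerance_min
        for other in record
    )


def coverage(record_a, record_b, tolerance_min=15):
    """Fraction of trips in record_a with a close counterpart in record_b."""
    if not record_a:
        return 0.0
    matched = sum(has_counterpart(t, record_b, tolerance_min) for t in record_a)
    return matched / len(record_a)


def similarity(record_a, record_b, tolerance_min=15):
    """Symmetric score in [0, 1]: the weaker of the two directional coverages,
    so a high score requires counterparts in both directions."""
    return min(
        coverage(record_a, record_b, tolerance_min),
        coverage(record_b, record_a, tolerance_min),
    )


# Made-up weekly records for illustration only.
week1 = [Trip(0, 485, "46A"), Trip(0, 1050, "46A"), Trip(2, 490, "46A")]
week2 = [Trip(0, 492, "46A"), Trip(0, 1060, "46A"), Trip(2, 480, "46A")]
other = [Trip(1, 720, "7"), Trip(4, 1380, "7")]

print(similarity(week1, week2))
print(similarity(week1, other))
```

Taking the minimum of the two coverages penalizes a record that is merely a small subset of another; a real system would likely weight attributes and tune the tolerance against known matches.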
Gregory S. Tseytin - One of the best experts on this subject based on the ideXlab platform.
-
Tracing Individual Public Transport Customers from an Anonymous Transaction Database
Journal of Public Transportation, 2006. Co-Authors: Gregory S. Tseytin, Markus Hofmann, Margaret O'Mahony, Donal Lyons. Abstract: Data mining concepts are frequently used throughout the transportation research sector. This paper examines the concept of the market basket technique as a means of gaining more insight into public transit users’ demands. The paper proposes a method that uses various data attributes of passenger records to infer the same customer in a different week (i.e., track the same customer from week to week). The general idea behind the measure is that if two records are considered similar, ideally every trip in one customer record should have a close counterpart in the other record. The research develops a similarity function aimed at maximizing the percentage of positive ticket identification over a number of weeks. Once similarity has been established, customer travel patterns can help the operator identify new routes and timetables and make strategic decisions about satisfying public transit customer demands.
Ankur B Shah - One of the best experts on this subject based on the ideXlab platform.
-
Reliance measurement technique in master data management (MDM) repositories and MDM repositories on clouded federated databases with linkages
2017. Co-Authors: Ajay Arangali Raghavan, Ganesh Boggaram, Ankur B Shah. Abstract: Provided are techniques for determining trustworthiness of data. Each customer record is assigned a single customer view identifier in a master data management repository. Linkages are used to determine one or more suspect records for each customer record. A score is assigned to each customer record based on the linkages to the one or more suspect records, tag information associated with each customer record, and configuration information. A heat map is generated to provide an indication of trustworthiness of the multiple records based on the reputation score associated with the origin of each record.
Markus Hofmann - One of the best experts on this subject based on the ideXlab platform.
-
Tracing Individual Public Transport Customers from an Anonymous Transaction Database
Journal of Public Transportation, 2006. Co-Authors: Gregory S. Tseytin, Markus Hofmann, Margaret O'Mahony, Donal Lyons. Abstract: Data mining concepts are frequently used throughout the transportation research sector. This paper examines the concept of the market basket technique as a means of gaining more insight into public transit users’ demands. The paper proposes a method that uses various data attributes of passenger records to infer the same customer in a different week (i.e., track the same customer from week to week). The general idea behind the measure is that if two records are considered similar, ideally every trip in one customer record should have a close counterpart in the other record. The research develops a similarity function aimed at maximizing the percentage of positive ticket identification over a number of weeks. Once similarity has been established, customer travel patterns can help the operator identify new routes and timetables and make strategic decisions about satisfying public transit customer demands.