Genealogy Ideas

timeline
History line
images
differing views
individual
parent/child
Connections
GUI
drop-down data to add
website additions
sourcing
geographical
Tree algorithm
Estimated facts
known facts
multi-source facts
automatic word search across all sources
stories by different family members with similar story lines
last names beginning with the same letter but spelled differently
multiple jobs on the timeline
varying homes
varying marriages and divorces
varying children from different marriages, or illegitimate
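
One of the ideas above, surnames that start with the same letter but are spelled differently, could be prototyped roughly as follows. This is only a sketch under assumptions: the function names, the 0.75 threshold, and the use of Python's standard difflib similarity ratio are illustrative choices, not a settled design. Each name is compared only against the first name in a candidate group.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similar(a, b, threshold=0.75):
    """Return True when two surnames are close enough to be possible variants."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def group_variants(surnames, threshold=0.75):
    """Group surnames that share a first letter and have similar spellings."""
    by_letter = defaultdict(list)
    for name in surnames:
        if not name:
            continue  # skip empty entries
        by_letter[name[0].upper()].append(name)

    groups = []
    for names in by_letter.values():
        letter_groups = []
        for name in names:
            # Compare against the first member of each existing group.
            for group in letter_groups:
                if similar(name, group[0], threshold):
                    group.append(name)
                    break
            else:
                letter_groups.append([name])
        groups.extend(letter_groups)
    return groups


if __name__ == "__main__":
    print(group_variants(["Meyer", "Myer", "Meier", "Smith", "Smyth", "Jones"]))
```

A phonetic code or an edit distance could replace the ratio; the threshold would need tuning on real records.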

Keyword search

Information retrieval
Web content mining
Text Mining
data warehouse
mining customer data
extraction tool kit
?web structure mining?
web scraping
web data mining
clustering data after it's retrieved.



Paper: "On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach" by Steven L. Salzberg

Data Mining and Knowledge Discovery, Vol. 1 (1997), pp. 317-327.








Below are some of the resources I couldn't download, so I saved the references here.

Complex semantic web ontology mapping

Authors:
Nuno Silva
GECAD - Knowledge Engineering and Decision Support Research Group, Instituto Superior de Engenharia do Porto, 4200-072 Porto, Portugal

João Rocha
GECAD - Knowledge Engineering and Decision Support Research Group, Instituto Superior de Engenharia do Porto, 4200-072 Porto, Portugal


Published in:
Journal: Web Intelligence and Agent Systems
Volume 1 Issue 3-4, December 2003
IOS Press, Amsterdam, The Netherlands


Enhanced Graph Based Genealogical Record Linkage

Authors:
Cary Sweet
Department of Computer Science, University of Calgary, Calgary, Alberta,

Tansel Özyer
TOBB Ekonomi ve Teknoloji Üniversitesi, Ankara, Turkey

Reda Alhajj
Department of Computer Science, University of Calgary, Calgary, Alberta, and Department of Computer Science, Global University, Beirut, Lebanon


Published in:
ADMA '07: Proceedings of the 3rd international conference on Advanced Data Mining and Applications
Springer-Verlag Berlin, Heidelberg ©2007
ISBN: 978-3-540-73870-1, doi: 10.1007/978-3-540-73871-8_44

B. Shaparenko and T. Joachims. Information genealogy: Uncovering the flow of ideas in non-hyperlinked document databases. In Knowledge Discovery and Data Mining (KDD) Conference, 2007.

ACM SIGSOFT Software Engineering Notes: Volume 30 Issue 4
July 2005

SIGSOFT Software Engineering Notes
Publisher: ACM


Citation Count: 0





Efficiently calculating inbreeding on large pedigrees databases
Brendan Elliott, En Cheng, Stephen Mayes, Z. Meral Ozsoyoglu
September 2009

Information Systems, Volume 34 Issue 6
Publisher: Elsevier Science Ltd.


Citation Count: 1


We consider pedigree data structured in the form of a directed acyclic graph, and use an encoding scheme, called NodeCodes, for expediting the evaluation of queries on pedigree graph structures. Inbreeding is the quantitative measure of the genetic relationship ...

Keywords: Family NodeCodes, Inbreeding coefficients, NodeCodes, Pedigree
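
The abstract above treats a pedigree as a directed acyclic graph. As a rough sketch only (this is a plain traversal, not the NodeCodes encoding from the paper, and the names `pedigree`, `lineage`, and `shared_ancestors` are made up for illustration), the graph can be held as a child-to-parents mapping and queried for ancestors two people share. An individual's inbreeding coefficient can only be non-zero when the parents share an ancestor or one parent is an ancestor of the other.

```python
def lineage(pedigree, person):
    """Return `person` plus all of their ancestors in a child -> parents mapping."""
    seen = {person}
    stack = list(pedigree.get(person, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(pedigree.get(current, ()))
    return seen


def shared_ancestors(pedigree, a, b):
    """Return the people who appear in the lineage of both `a` and `b`."""
    return lineage(pedigree, a) & lineage(pedigree, b)


if __name__ == "__main__":
    # Hypothetical data: dad and mom are half-siblings through "gpa".
    pedigree = {
        "child": ("dad", "mom"),
        "dad": ("gpa", "gma"),
        "mom": ("gpa", "other"),
    }
    print(shared_ancestors(pedigree, "dad", "mom"))
```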

Survey on test collections and techniques for personal name matching
Patrick Reuther, Bernd Walter
January 2006

International Journal of Metadata, Semantics and Ontologies, Volume 1 Issue 2
Publisher: Inderscience Publishers


Citation Count: 2


This paper gives an overview of personal name matching. Personal name matching is of great importance for all applications that deal with personal names. The problem with personal names is that they are not unique and sometimes even for one name many ...

Keywords: co-authorship networks, data test collections, duplicate detection, duplicates, name disambiguation, personal name matching, personal names, record linkage, semantics, social networks

Proactive control of manufacturing processes using historical data
Manfred Grauer, Sachin Karadgi, Ulf Müller, Daniel Metz, Walter Schäfer
September 2010

KES'10: Proceedings of the 14th international conference on Knowledge-based and intelligent information and engineering systems: Part II
Publisher: Springer-Verlag


Citation Count: 0


Today's enterprises have complex manufacturing processes with several automation systems. These systems generate enormous amount of data in real-time representing feedbacks, positions, and alerts, among others. This data can be stored in relational databases ...
Keywords: case-based reasoning, enterprise integration, historical data, manufacturing processes, proactive control, similarity metrics

Multi-source toponym data integration and mediation for a meta-gazetteer service
Philip D. Smart, Christopher B. Jones, Florian A. Twaroch
September 2010

GIScience'10: Proceedings of the 6th international conference on Geographic information science
Publisher: Springer-Verlag


Citation Count: 0


A variety of gazetteers exist based on administrative or user contributed data. Each of these data sources has benefits for particular geographical analysis and information retrieval tasks but none is a one fit all solution. We present a mediation framework ...
Keywords: gazetteers, geo-web services, mediation architecture, place names, spatial data integration

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Pavel Berkhin, Rich Caruana, Xindong Wu
August 2007

KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Publisher: ACM


Citation Count: 0


This proceedings is the published record of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-07) held in San Jose, California on August 12--15, 2007. The KDD-07 conference provides a forum for novel ...

Exploring personal media: A spatial interface supporting user-defined semantic regions
Hyunmo Kang, Ben Shneiderman
June 2006

Journal of Visual Languages and Computing, Volume 17 Issue 3
Publisher: Academic Press, Inc.


Citation Count: 0


Graphical mechanisms for spatially organizing personal media data could enable users to fruitfully apply their conceptual models. This paper introduces Semantic regions, an innovative way for users to construct display representations of their conceptual ...

Keywords: Dynamic queries, Fling-and-flock, Personal media management, Spatial information management, User interfaces

From web data to entities and back
Zoltán Miklós, Nicolas Bonvin, Paolo Bouquet, Michele Catasta, Daniele Cordioli, Peter Fankhauser, Julien Gaugaz, Ekaterini Ioannou, Hristo Koshutanski, Antonio Maña, Claudia Niederée, Themis Palpanas, Heiko Stoermer
June 2010

CAiSE'10: Proceedings of the 22nd international conference on Advanced information systems engineering
Publisher: Springer-Verlag


Citation Count: 0


We present the Entity Name System (ENS), an enabling infrastructure, which can host descriptions of named entities and provide unique identifiers, on large-scale. In this way, it opens new perspectives to realize entity-oriented, rather than keyword-oriented, ...
Keywords: entity, unique identifier, web

Categorisation of web documents using extraction ontologies
Li Xu, David W. Embley
November 2008

International Journal of Metadata, Semantics and Ontologies, Volume 3 Issue 1
Publisher: Inderscience Publishers


Citation Count: 0


Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based on information-extraction ontologies. Given HTML documents, tables, and ...

Keywords: HTML documents, document categorisation, document classification, extraction ontologies, information extraction, information retrieval, internet, machine learning, web documents


Reference metadata extraction using a hierarchical knowledge representation framework
Min-Yuh Day, Richard Tzong-Han Tsai, Cheng-Lung Sung, Chiu-Chen Hsieh, Cheng-Wei Lee, Shih-Hung Wu, Kun-Pin Wu, Chorng-Shyong Ong, Wen-Lian Hsu
February 2007

Decision Support Systems, Volume 43 Issue 1
Publisher: Elsevier Science Publishers B. V.


Citation Count: 3


The integration of bibliographical information on scholarly publications available on the Internet is an important task in the academic community. Accurate reference metadata extraction from such publications is essential for the integration of metadata ...

Keywords: INFOMAP, Knowledge representation framework, Metadata extraction, Reference extraction


Novel information discovery for intelligence and counterterrorism
D. B. Skillicorn, N. Vats
August 2007

Decision Support Systems, Volume 43 Issue 4
Publisher: Elsevier Science Publishers B. V.


Citation Count: 1


Intelligence analysts construct hypotheses from large volumes of data, but are often limited by social and organizational norms and their own preconceptions and biases. The use of exploratory data mining technology can mitigate these limitations by requiring ...

Keywords: Al Qaeda, Counterterrorism, Information discovery, Intelligence analysis, Novelty

Discovering Document Semantics QBYS: A System for Querying the WWW by Semantics
Michael Johnson, Farshad Fotouhi, Sorin Drăghici, Ming Dong, Duo Xu
November 2004

Multimedia Tools and Applications, Volume 24 Issue 2
Publisher: Kluwer Academic Publishers


Citation Count: 0


This paper describes our research into a query-by-semantics approach to searching the World Wide Web. This research extends existing work, which had focused on a query-by-structure approach for the Web. We present a system that allows users to request ...

Keywords: document features, document type, neural network, query-by-semantics

A pattern recognition-based approach for phylogenetic network construction with constrained recombination
M. A. H. Zahid, Ankush Mittal, R. C. Joshi
December 2006

Pattern Recognition, Volume 39 Issue 12
Publisher: Elsevier Science Inc.


Citation Count: 1


The tree representation of evolutionary relationship oversimplifies the view of the process of evolution as it cannot take into account the events such as horizontal gene transfer, hybridization, homoplasy and genetic recombination. Several algorithms ...

Keywords: Evolutionary relationship, Gall trees, Pattern recognition, Phylogenetic network, Recombination, SNP

Spam email filtering using network-level properties
Paulo Cortez, André Correia, Pedro Sousa, Miguel Rocha, Miguel Rio
July 2010

ICDM'10: Proceedings of the 10th industrial conference on Advances in data mining: applications and theoretical aspects
Publisher: Springer-Verlag


Citation Count: 0


Spam is serious problem that affects email users (e.g. phishing attacks, viruses and time spent reading unwanted messages). We propose a novel spam email filtering approach based on network-level attributes (e.g. the IP sender geographic coordinates) ...
Keywords: anti-spam filtering, naive bayes, support vector machines, text mining

Country wise classification of human names
Raju Balakrishnan
February 2006

AIKED'06: Proceedings of the 5th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases
Publisher: World Scientific and Engineering Academy and Society (WSEAS)


Citation Count: 0


Person names in a country follow a particular statistical trend and names of a large set of individuals in a country are derived from a set of names having smaller cardinality. The frequency distribution of person names of different countries varies ...

Keywords: Levenshtein distance, data mining, etymology, genealogy, k-nearest neighbor classification

Nichols, Johanna and Balthasar Bickel. 2009. The AUTOTYP genealogy and geography database: 2009 release. http://www.uni-leipzig.de/~autotyp.

Discovery of spatial association rules in geo-referenced census data: A relational mining approach
Annalisa Appice, Michelangelo Ceci, Antonietta Lanza, Francesca A. Lisi, Donato Malerba
December 2003

Intelligent Data Analysis, Volume 7 Issue 6
Publisher: IOS Press


Citation Count: 7


Census data mining has great potential both in business development and in good public policy, but still must be solved in this field a number of research issues. In this paper, problems related to the geo-referenciation of census data are considered. ...

Mining census data for spatial effects on mortality
Willi Klösgen, Michael May, Jim Petch
December 2003

Intelligent Data Analysis, Volume 7 Issue 6
Publisher: IOS Press


Citation Count: 0


The paper describes a system for spatial data mining illustrating its features by an application to spatial census data. Using census data for data mining includes specific challenges. Because of data privacy regulations, census data are generally available ...

Keywords: census data, mortality rates, small areas, spatial data mining, subgroup mining


First-order temporal pattern mining with regular expression constraints
Sandra de Amo, Daniel A. Furtado
September 2007

Data & Knowledge Engineering, Volume 62 Issue 3
Publisher: Elsevier Science Publishers B. V.


Citation Count: 6


Previous studies on mining sequential patterns have focused on temporal patterns specified by some form of propositional temporal logic. However, there are some interesting sequential patterns, such as the multi-sequential patterns, whose specification ...

Keywords: Constraint-based mining, Frequent sequential patterns, Regular expression constraints, Temporal data mining



Genetic algorithm based framework for mining fuzzy association rules
M. Kaya, R. Alhajj
June 2005

Fuzzy Sets and Systems, Volume 152 Issue 3
Publisher: Elsevier North-Holland, Inc.


Citation Count: 2


It is not an easy task to know a priori the most appropriate fuzzy sets that cover the domains of quantitative attributes for fuzzy association rules mining, simply because characteristics of quantitative data are in general unknown. Besides, it is unrealistic ...

Keywords: Association rules, CURE clustering algorithm, Data mining, Fuzzy sets, Genetic algorithms, Quantitative attributes


A Scalable Parallel Algorithm for Self-Organizing Maps with Applications to Sparse Data Mining Problems
R. D. Lawrence, G. S. Almasi, H. E. Rushmeier
June 1999

Data Mining and Knowledge Discovery, Volume 3 Issue 2
Publisher: Kluwer Academic Publishers


Citation Count: 5


We describe a scalable parallel implementation of the self organizing map (SOM) suitable for data-mining applications involving clustering or segmentation against large data sets such as those encountered in the analysis of customer spending patterns. The ...

Keywords: Kohonen self-organizing maps, clustering, data visualization, parallel IO, parallel processing, scalable data mining