Thursday, 8 May 2014

Publicly available PCAP files

This is a list of public packet capture repositories, which are freely available on the Internet.
Most of the sites listed below share full-content PCAP files, but some unfortunately provide only truncated frames.


http://www.netresec.com/?page=PcapFiles
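
To check whether a downloaded capture holds full content or truncated frames, one option is to compare each record's captured length with its original on-the-wire length. The following is a minimal sketch, assuming the file uses the classic libpcap format (not pcapng); the function name count_truncated_frames and the in-memory sample capture are made up for illustration and do not come from any of the listed sites.

```python
import io
import struct

# Magic numbers of the classic libpcap format, as they appear on disk.
LITTLE_ENDIAN_MAGICS = (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1")
BIG_ENDIAN_MAGICS = (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d")


def count_truncated_frames(stream):
    """Return (total_frames, truncated_frames) for a classic pcap stream."""
    global_header = stream.read(24)
    if len(global_header) < 24:
        raise ValueError("not a pcap file: global header too short")

    magic = global_header[:4]
    if magic in LITTLE_ENDIAN_MAGICS:
        endian = "<"
    elif magic in BIG_ENDIAN_MAGICS:
        endian = ">"
    else:
        raise ValueError("unrecognized magic number (pcapng is not handled)")

    total = truncated = 0
    while True:
        record_header = stream.read(16)
        if len(record_header) < 16:
            break  # end of file (or a cut-off trailing record)
        # Fields: timestamp seconds, timestamp fraction,
        # captured length, original length on the wire.
        _sec, _frac, incl_len, orig_len = struct.unpack(
            endian + "IIII", record_header
        )
        stream.read(incl_len)  # skip over the captured bytes
        total += 1
        if incl_len < orig_len:
            truncated += 1
    return total, truncated


# Hypothetical two-frame capture built in memory: one full, one truncated.
sample = (
    struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    + struct.pack("<IIII", 0, 0, 4, 4) + b"abcd"
    + struct.pack("<IIII", 0, 0, 4, 100) + b"abcd"
)
print(count_truncated_frames(io.BytesIO(sample)))  # (2, 1)
```

For a file on disk, open it in binary mode and pass the file object to the same function. A non-zero second value suggests the capture was taken with a limited snapshot length.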

Data Mining: Books, research papers, tutorials, related links.

BOOKS:   A group member asked for some book recommendations.  ALL of the following are excellent introductory texts.

Data Mining - Practical Machine Learning Tools and Techniques, Third Edition
by Ian H. Witten, Eibe Frank, and Mark A. Hall

Discovering Knowledge in Data - An Introduction to Data Mining
by Daniel T. Larose

Handbook of Statistical Analysis and Data Mining Applications
by Robert Nisbet, John Elder, and Gary Miner

Data Mining Techniques - For Marketing, Sales, and Customer Relationship Management, 2nd Edition
by Michael J.A. Berry and Gordon S. Linoff

Making Sense of Data II - A Practical Guide to Data Visualization, Advanced Data Mining Methods, and Applications
by Glenn J. Myatt and Wayne P. Johnson

Intelligent Data Analysis - An Introduction
by Michael R. Berthold and David J. Hand

Applied Data Mining - Statistical Methods for Business and Industry
by Paolo Giudici

Mining the Web: Transforming Customer Data into Customer Value
by Gordon S. Linoff and Michael J.A. Berry

Introduction to Data Mining
by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar

Data Mining Concepts and Techniques
by Jiawei Han and Micheline Kamber

Gödel, Escher, Bach: An Eternal Golden Braid
by Douglas R. Hofstadter
Though not directly about DM/PA/ML/DS, this is a classic text that has inspired many.  It dives into subjects that interest us ... recursion and consciousness ... and it does it in a most inspiring way.  A guaranteed great read.


FREE BOOKS (PDFs):   All of these books are legal downloads.

The Elements of Statistical Learning - Data Mining, Inference, and Prediction
A highly recommended classic survey by Jerome Friedman, Trevor Hastie, and Robert Tibshirani

Social Media Mining: An Introduction
A recent textbook from Cambridge University Press on data mining and social network analysis in social media.  By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu.
Download it here (free): http://dmml.asu.edu/smm/

Introduction to Information Retrieval
Another classic that presents important, foundational information on text retrieval and text mining.  By Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze.
Download it here (free): http://nlp.stanford.edu/IR-book/

Data Mining for the Masses
An overview of data mining that steps you through all the phases from objectives to implementation.  Demonstrations use RapidMiner.  By Matt North.

Top 10 Algorithms in Data Mining
by Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand, and Dan Steinberg
Download it here (free):  http://goo.gl/E7wXxG

Social Media Mining: An Introduction
Social media mining is the process of representing, analyzing, and extracting meaningful patterns from data in social media, resulting from social interactions.  By Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu.
Download it here (free): http://dmml.asu.edu/smm/book/

A Programmer's Guide to Data Mining
Simply and concisely written.  Includes work-along Python scripts and datasets.
Download it here (free): http://guidetodatamining.com/

Prolog for Programmers
Prolog is a general-purpose logic programming language associated with artificial intelligence and computational linguistics.

InTech is a site for open source publications (books and papers), including DM/PA/ML/DS pubs:  http://www.intechopen.com/


TUTORIAL SITES


FREE VIDEO TRAINING

Subscribe to our YouTube Education Channel at: http://www.youtube.com/user/PredictiveModeling/
We gather a lot of educational videos on this channel.  Among them are:
  • The Statistical Aspects of Data Mining (Google)
  • Machine Learning (Stanford, Andrew Ng)
  • Machine Learning (Nando de Freitas)
  • Software tutorials on R, Knime, Predixion, Systat, RapidMiner, Weka, Statistica and SAS
Also, if you know of some high-quality YouTube tutorials, please let us know.  We'll add them to the channel playlists.

Data Mining Course Utilizing R and the Rattle Interface:  http://georgia-r-school.wistia.com/projects/5vyg37615i

VideoLectures.NET has excellent and generally more advanced DM/PA topical tutorials:  http://videolectures.net/


RESEARCH PAPERS


FREE WEB MINING SERVICE