Upload MovieLens 1M Dataset
This commit is contained in:
commit
115a711b24
170
dataset/README
Normal file
170
dataset/README
Normal file
@ -0,0 +1,170 @@
|
||||
SUMMARY
|
||||
================================================================================
|
||||
|
||||
These files contain 1,000,209 anonymous ratings of approximately 3,900 movies
|
||||
made by 6,040 MovieLens users who joined MovieLens in 2000.
|
||||
|
||||
USAGE LICENSE
|
||||
================================================================================
|
||||
|
||||
Neither the University of Minnesota nor any of the researchers
|
||||
involved can guarantee the correctness of the data, its suitability
|
||||
for any particular purpose, or the validity of results based on the
|
||||
use of the data set. The data set may be used for any research
|
||||
purposes under the following conditions:
|
||||
|
||||
* The user may not state or imply any endorsement from the
|
||||
University of Minnesota or the GroupLens Research Group.
|
||||
|
||||
* The user must acknowledge the use of the data set in
|
||||
publications resulting from the use of the data set
|
||||
(see below for citation information).
|
||||
|
||||
* The user may not redistribute the data without separate
|
||||
permission.
|
||||
|
||||
* The user may not use this information for any commercial or
|
||||
revenue-bearing purposes without first obtaining permission
|
||||
from a faculty member of the GroupLens Research Project at the
|
||||
University of Minnesota.
|
||||
|
||||
If you have any further questions or comments, please contact GroupLens
|
||||
<grouplens-info@cs.umn.edu>.
|
||||
|
||||
CITATION
|
||||
================================================================================
|
||||
|
||||
To acknowledge use of the dataset in publications, please cite the following
|
||||
paper:
|
||||
|
||||
F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History
|
||||
and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4,
|
||||
Article 19 (December 2015), 19 pages. DOI=http://dx.doi.org/10.1145/2827872
|
||||
|
||||
|
||||
ACKNOWLEDGEMENTS
|
||||
================================================================================
|
||||
|
||||
Thanks to Shyong Lam and Jon Herlocker for cleaning up and generating the data
|
||||
set.
|
||||
|
||||
FURTHER INFORMATION ABOUT THE GROUPLENS RESEARCH PROJECT
|
||||
================================================================================
|
||||
|
||||
The GroupLens Research Project is a research group in the Department of
|
||||
Computer Science and Engineering at the University of Minnesota. Members of
|
||||
the GroupLens Research Project are involved in many research projects related
|
||||
to the fields of information filtering, collaborative filtering, and
|
||||
recommender systems. The project is lead by professors John Riedl and Joseph
|
||||
Konstan. The project began to explore automated collaborative filtering in
|
||||
1992, but is most well known for its world wide trial of an automated
|
||||
collaborative filtering system for Usenet news in 1996. Since then the project
|
||||
has expanded its scope to research overall information filtering solutions,
|
||||
integrating in content-based methods as well as improving current collaborative
|
||||
filtering technology.
|
||||
|
||||
Further information on the GroupLens Research project, including research
|
||||
publications, can be found at the following web site:
|
||||
|
||||
http://www.grouplens.org/
|
||||
|
||||
GroupLens Research currently operates a movie recommender based on
|
||||
collaborative filtering:
|
||||
|
||||
http://www.movielens.org/
|
||||
|
||||
RATINGS FILE DESCRIPTION
|
||||
================================================================================
|
||||
|
||||
All ratings are contained in the file "ratings.dat" and are in the
|
||||
following format:
|
||||
|
||||
UserID::MovieID::Rating::Timestamp
|
||||
|
||||
- UserIDs range between 1 and 6040
|
||||
- MovieIDs range between 1 and 3952
|
||||
- Ratings are made on a 5-star scale (whole-star ratings only)
|
||||
- Timestamp is represented in seconds since the epoch as returned by time(2)
|
||||
- Each user has at least 20 ratings
|
||||
|
||||
USERS FILE DESCRIPTION
|
||||
================================================================================
|
||||
|
||||
User information is in the file "users.dat" and is in the following
|
||||
format:
|
||||
|
||||
UserID::Gender::Age::Occupation::Zip-code
|
||||
|
||||
All demographic information is provided voluntarily by the users and is
|
||||
not checked for accuracy. Only users who have provided some demographic
|
||||
information are included in this data set.
|
||||
|
||||
- Gender is denoted by a "M" for male and "F" for female
|
||||
- Age is chosen from the following ranges:
|
||||
|
||||
* 1: "Under 18"
|
||||
* 18: "18-24"
|
||||
* 25: "25-34"
|
||||
* 35: "35-44"
|
||||
* 45: "45-49"
|
||||
* 50: "50-55"
|
||||
* 56: "56+"
|
||||
|
||||
- Occupation is chosen from the following choices:
|
||||
|
||||
* 0: "other" or not specified
|
||||
* 1: "academic/educator"
|
||||
* 2: "artist"
|
||||
* 3: "clerical/admin"
|
||||
* 4: "college/grad student"
|
||||
* 5: "customer service"
|
||||
* 6: "doctor/health care"
|
||||
* 7: "executive/managerial"
|
||||
* 8: "farmer"
|
||||
* 9: "homemaker"
|
||||
* 10: "K-12 student"
|
||||
* 11: "lawyer"
|
||||
* 12: "programmer"
|
||||
* 13: "retired"
|
||||
* 14: "sales/marketing"
|
||||
* 15: "scientist"
|
||||
* 16: "self-employed"
|
||||
* 17: "technician/engineer"
|
||||
* 18: "tradesman/craftsman"
|
||||
* 19: "unemployed"
|
||||
* 20: "writer"
|
||||
|
||||
MOVIES FILE DESCRIPTION
|
||||
================================================================================
|
||||
|
||||
Movie information is in the file "movies.dat" and is in the following
|
||||
format:
|
||||
|
||||
MovieID::Title::Genres
|
||||
|
||||
- Titles are identical to titles provided by the IMDB (including
|
||||
year of release)
|
||||
- Genres are pipe-separated and are selected from the following genres:
|
||||
|
||||
* Action
|
||||
* Adventure
|
||||
* Animation
|
||||
* Children's
|
||||
* Comedy
|
||||
* Crime
|
||||
* Documentary
|
||||
* Drama
|
||||
* Fantasy
|
||||
* Film-Noir
|
||||
* Horror
|
||||
* Musical
|
||||
* Mystery
|
||||
* Romance
|
||||
* Sci-Fi
|
||||
* Thriller
|
||||
* War
|
||||
* Western
|
||||
|
||||
- Some MovieIDs do not correspond to a movie due to accidental duplicate
|
||||
entries and/or test entries
|
||||
- Movies are mostly entered by hand, so errors and inconsistencies may exist
|
3883
dataset/movies.dat
Normal file
3883
dataset/movies.dat
Normal file
File diff suppressed because it is too large
Load Diff
1000209
dataset/ratings.dat
Normal file
1000209
dataset/ratings.dat
Normal file
File diff suppressed because it is too large
Load Diff
6040
dataset/users.dat
Normal file
6040
dataset/users.dat
Normal file
File diff suppressed because it is too large
Load Diff
Loading…
x
Reference in New Issue
Block a user