Preliminary data analysis completed.
113
analysis_results/analysis_report.html
Normal file
@@ -0,0 +1,113 @@
<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>MovieLens Dataset Analysis Report</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 0; padding: 20px; color: #333; }
        .container { max-width: 1200px; margin: 0 auto; }
        h1 { color: #2c3e50; text-align: center; margin-bottom: 30px; }
        h2 { color: #3498db; margin-top: 30px; border-bottom: 1px solid #eee; padding-bottom: 10px; }
        .summary { background-color: #f9f9f9; padding: 15px; border-radius: 5px; margin-bottom: 20px; }
        .gallery { display: flex; flex-wrap: wrap; gap: 20px; justify-content: center; margin-top: 20px; }
        .gallery img { max-width: 100%; height: auto; border-radius: 5px; box-shadow: 0 2px 5px rgba(0,0,0,0.1); }
        .figure { margin-bottom: 30px; text-align: center; }
        .figure img { max-width: 100%; height: auto; border-radius: 5px; box-shadow: 0 2px 5px rgba(0,0,0,0.1); }
        .figure .caption { margin-top: 10px; font-style: italic; color: #666; }
        table { width: 100%; border-collapse: collapse; margin: 20px 0; }
        th, td { padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }
        th { background-color: #f2f2f2; }
        tr:hover { background-color: #f5f5f5; }
    </style>
</head>
<body>
    <div class="container">
        <h1>MovieLens Dataset User-Movie Preference Analysis Report</h1>

        <div class="summary">
            <h2>Data Overview</h2>
            <p>This analysis is based on the MovieLens dataset, which contains 6,040 users, 3,883 movies, and 1,000,209 original rating records.</p>
        </div>

        <h2>User Profile Analysis</h2>
        <div class="figure">
            <img src="user_analysis/gender_distribution.png" alt="User gender distribution">
            <p class="caption">User Gender Distribution</p>
        </div>

        <div class="figure">
            <img src="user_analysis/age_distribution.png" alt="User age distribution">
            <p class="caption">User Age Distribution</p>
        </div>

        <div class="figure">
            <img src="user_analysis/occupation_distribution.png" alt="User occupation distribution">
            <p class="caption">User Occupation Distribution</p>
        </div>

        <h2>Movie Distribution Analysis</h2>
        <div class="figure">
            <img src="movie_analysis/genre_distribution.png" alt="Movie genre distribution">
            <p class="caption">Movie Genre Distribution</p>
        </div>

        <div class="figure">
            <img src="movie_analysis/year_distribution.png" alt="Movie release year distribution">
            <p class="caption">Movie Release Year Distribution</p>
        </div>

        <div class="figure">
            <img src="movie_analysis/most_rated_movies.png" alt="Most rated movies">
            <p class="caption">Top 20 Most Rated Movies</p>
        </div>

        <h2>Rating Distribution Analysis</h2>
        <div class="figure">
            <img src="rating_analysis/rating_distribution.png" alt="Rating distribution">
            <p class="caption">Rating Distribution</p>
        </div>

        <div class="figure">
            <img src="rating_analysis/genre_avg_ratings.png" alt="Average rating by movie genre">
            <p class="caption">Average Rating by Movie Genre</p>
        </div>

        <div class="figure">
            <img src="rating_analysis/top_rated_movies.png" alt="Highest rated movies">
            <p class="caption">Top 20 Highest Rated Movies (min. 100 ratings)</p>
        </div>

        <h2>User Characteristics and Movie Preferences</h2>
        <div class="figure">
            <img src="preference_analysis/gender_genre_heatmap.png" alt="Movie genre preferences by gender">
            <p class="caption">Movie Genre Preferences by Gender</p>
        </div>

        <div class="figure">
            <img src="preference_analysis/age_genre_heatmap.png" alt="Movie genre preferences by age group">
            <p class="caption">Movie Genre Preferences by Age Group</p>
        </div>

        <div class="figure">
            <img src="preference_analysis/age_year_heatmap.png" alt="Preferences for movies from different decades by age group">
            <p class="caption">Preferences for Movies by Decade Across Age Groups</p>
        </div>

        <div class="figure">
            <img src="preference_analysis/gender_avg_rating.png" alt="Relationship between gender and average rating">
            <p class="caption">Relationship Between Gender and Average Rating</p>
        </div>

        <h2>Conclusions and Insights</h2>
        <p>Through in-depth analysis of the MovieLens dataset, we found significant correlations between user characteristics (gender, age, occupation) and movie preferences. Key findings include:</p>
        <ul>
            <li>Significant differences in movie genre preferences between genders</li>
            <li>Age influences how users rate movies from different decades</li>
            <li>Occupational background correlates with genre preferences</li>
        </ul>
        <p>These findings provide a useful reference for designing movie recommendation systems and developing movie marketing strategies.</p>
    </div>
</body>
</html>

33
analysis_results/analysis_summary.json
Normal file
@@ -0,0 +1,33 @@
{
    "Data Overview": {
        "Number of Users": 6040,
        "Number of Movies": 3883,
        "Original Ratings Count": 1000209,
        "Filled Ratings Count": 22384240
    },
    "User Analysis": {
        "Gender Distribution": {
            "M": 4331,
            "F": 1709
        },
        "Age Distribution": {
            "25-34": 2096,
            "35-44": 1193,
            "18-24": 1103,
            "45-49": 550,
            "50-55": 496,
            "56+": 380,
            "Under 18": 222
        }
    },
    "Rating Analysis": {
        "Average Rating": 3.58,
        "Rating Distribution": {
            "1": 56174,
            "2": 107557,
            "3": 261197,
            "4": 348971,
            "5": 226310
        }
    }
}
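The figures in analysis_summary.json can be cross-checked against one another. The following is a minimal Python sketch, not part of this commit: the function name `check_summary` is hypothetical, and the values are copied from the file into an embedded string so the script runs without the file on disk. It recomputes the average rating from the rating distribution and confirms that the gender, age, and rating counts add up to the totals in "Data Overview". It also divides "Filled Ratings Count" by the number of users; reading the result as the column count of a dense user-by-movie matrix is an assumption, since the commit does not say how the filled ratings were produced.

```python
import json

# Values copied from analysis_results/analysis_summary.json in this commit.
SUMMARY_JSON = """
{
  "Data Overview": {"Number of Users": 6040, "Number of Movies": 3883,
                    "Original Ratings Count": 1000209,
                    "Filled Ratings Count": 22384240},
  "User Analysis": {
    "Gender Distribution": {"M": 4331, "F": 1709},
    "Age Distribution": {"25-34": 2096, "35-44": 1193, "18-24": 1103,
                         "45-49": 550, "50-55": 496, "56+": 380,
                         "Under 18": 222}
  },
  "Rating Analysis": {
    "Average Rating": 3.58,
    "Rating Distribution": {"1": 56174, "2": 107557, "3": 261197,
                            "4": 348971, "5": 226310}
  }
}
"""


def check_summary(summary: dict) -> dict:
    """Hypothetical helper: cross-check the totals reported in the summary."""
    overview = summary["Data Overview"]
    users = summary["User Analysis"]
    ratings = summary["Rating Analysis"]

    # JSON object keys are strings, so convert the star values to integers.
    dist = {int(star): n for star, n in ratings["Rating Distribution"].items()}
    total_ratings = sum(dist.values())
    mean = sum(star * n for star, n in dist.items()) / total_ratings

    return {
        "gender_total_ok": sum(users["Gender Distribution"].values())
        == overview["Number of Users"],
        "age_total_ok": sum(users["Age Distribution"].values())
        == overview["Number of Users"],
        "rating_total_ok": total_ratings == overview["Original Ratings Count"],
        "recomputed_mean": mean,
        # The reported mean has two decimals, so allow half a unit in the
        # last reported place.
        "mean_ok": abs(mean - ratings["Average Rating"]) < 0.005,
        # (quotient, remainder) of filled ratings per user. Interpreting the
        # quotient as a number of movies is an assumption.
        "filled_per_user": divmod(overview["Filled Ratings Count"],
                                  overview["Number of Users"]),
    }


if __name__ == "__main__":
    for name, value in check_summary(json.loads(SUMMARY_JSON)).items():
        print(f"{name}: {value}")
```

The filled count divides evenly by the number of users, giving 3706 per user, which is fewer than the 3883 movies listed. One possible explanation, not confirmed by this commit, is that only movies with at least one rating were included in the filled matrix.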
BIN analysis_results/movie_analysis/genre_avg_rating_counts.png (Normal file, 37 KiB)
BIN analysis_results/movie_analysis/genre_distribution.png (Normal file, 35 KiB)
BIN analysis_results/movie_analysis/most_rated_movies.png (Normal file, 77 KiB)
BIN analysis_results/movie_analysis/movie_rating_counts.png (Normal file, 32 KiB)
BIN analysis_results/movie_analysis/year_avg_rating_counts.png (Normal file, 86 KiB)
BIN analysis_results/movie_analysis/year_distribution.png (Normal file, 28 KiB)
BIN analysis_results/preference_analysis/age_18-24_preferences.png (Normal file, 23 KiB)
BIN analysis_results/preference_analysis/age_25-34_preferences.png (Normal file, 24 KiB)
BIN analysis_results/preference_analysis/age_35-44_preferences.png (Normal file, 24 KiB)
BIN analysis_results/preference_analysis/age_45-49_preferences.png (Normal file, 23 KiB)
BIN analysis_results/preference_analysis/age_50-55_preferences.png (Normal file, 23 KiB)
BIN analysis_results/preference_analysis/age_56+_preferences.png (Normal file, 23 KiB)
[file name not captured] (23 KiB)
BIN analysis_results/preference_analysis/age_avg_rating.png (Normal file, 36 KiB)
BIN analysis_results/preference_analysis/age_genre_heatmap.png (Normal file, 127 KiB)
BIN analysis_results/preference_analysis/age_rating_std.png (Normal file, 39 KiB)
BIN analysis_results/preference_analysis/age_year_heatmap.png (Normal file, 82 KiB)
BIN analysis_results/preference_analysis/count_vs_avg_rating.png (Normal file, 334 KiB)
BIN analysis_results/preference_analysis/gender_F_preferences.png (Normal file, 31 KiB)
BIN analysis_results/preference_analysis/gender_M_preferences.png (Normal file, 30 KiB)
BIN analysis_results/preference_analysis/gender_avg_rating.png (Normal file, 24 KiB)
BIN analysis_results/preference_analysis/gender_genre_heatmap.png (Normal file, 61 KiB)
BIN analysis_results/preference_analysis/gender_rating_std.png (Normal file, 24 KiB)
[file name not captured] (23 KiB)
[file name not captured] (22 KiB)
[file name not captured] (23 KiB)
[file name not captured] (23 KiB)
[file name not captured] (22 KiB)
[file name not captured] (22 KiB)
[file name not captured] (22 KiB)
BIN analysis_results/preference_analysis/occupation_avg_rating.png (Normal file, 66 KiB)
[file name not captured] (128 KiB)
BIN analysis_results/rating_analysis/genre_avg_ratings.png (Normal file, 45 KiB)
BIN analysis_results/rating_analysis/original_vs_filled_ratings.png (Normal file, 26 KiB)
BIN analysis_results/rating_analysis/rating_distribution.png (Normal file, 19 KiB)
BIN analysis_results/rating_analysis/rating_trends_over_time.png (Normal file, 76 KiB)
BIN analysis_results/rating_analysis/top_rated_movies.png (Normal file, 106 KiB)
BIN analysis_results/rating_analysis/year_avg_ratings.png (Normal file, 129 KiB)
BIN analysis_results/user_analysis/age_activity.png (Normal file, 41 KiB)
BIN analysis_results/user_analysis/age_distribution.png (Normal file, 25 KiB)
BIN analysis_results/user_analysis/gender_activity.png (Normal file, 21 KiB)
BIN analysis_results/user_analysis/gender_age_distribution.png (Normal file, 23 KiB)
BIN analysis_results/user_analysis/gender_distribution.png (Normal file, 20 KiB)
BIN analysis_results/user_analysis/occupation_activity.png (Normal file, 82 KiB)
BIN analysis_results/user_analysis/occupation_distribution.png (Normal file, 49 KiB)
BIN analysis_results/user_analysis/rating_activity_distribution.png (Normal file, 29 KiB)
BIN analysis_results/user_analysis/region_distribution.png (Normal file, 34 KiB)