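Overhaul the chart-drawing methods so that Chinese characters render correctly. Optimize program performance and fix a serious logic error in the _analyze_group_genre_preference function.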
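
The portion of the diff shown here covers only the generated HTML report, so the corrected _analyze_group_genre_preference is not visible. As a rough, hypothetical sketch of what a group-by-genre preference table might compute, each rating can be counted once for every genre of the rated movie, since MovieLens stores genres as a pipe-separated string. The function name group_genre_preference, its parameters, and the sample data are assumptions made for illustration and are not the author's code.

```python
from collections import defaultdict


def group_genre_preference(ratings, group_of_user, genres_of_movie):
    """Return the mean rating for every (group, genre) pair.

    Hypothetical helper, not taken from the commit.
    ratings:         iterable of (user_id, movie_id, rating)
    group_of_user:   dict mapping user_id to a group label (e.g. gender)
    genres_of_movie: dict mapping movie_id to a pipe-separated genre string
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_id, movie_id, rating in ratings:
        group = group_of_user[user_id]
        # A movie with several genres contributes its rating once per genre.
        for genre in genres_of_movie[movie_id].split("|"):
            totals[(group, genre)] += rating
            counts[(group, genre)] += 1
    # Divide by the per-pair count, not by the group's overall rating count.
    return {key: totals[key] / counts[key] for key in totals}


# Made-up sample data, for illustration only.
sample_ratings = [(1, 10, 4), (1, 11, 2), (2, 10, 5)]
sample_groups = {1: "F", 2: "M"}
sample_genres = {10: "Action|Comedy", 11: "Comedy"}
prefs = group_genre_preference(sample_ratings, sample_groups, sample_genres)
```

Keeping the sums and counts keyed by the same (group, genre) pair is one way to avoid averaging over the wrong denominator, which is a common source of errors in this kind of aggregation; whether that was the error fixed in this commit is not stated.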

Cat Tom
2025-05-07 23:07:46 +08:00
parent dca1aec8cc
commit 9153124ff7
60 changed files with 308 additions and 91 deletions


@@ -3,7 +3,7 @@
 <html>
 <head>
 <meta charset="utf-8">
-<title>MovieLens Dataset Analysis Report</title>
+<title>MovieLens数据集分析报告</title>
 <style>
 body { font-family: Arial, sans-serif; margin: 0; padding: 20px; color: #333; }
 .container { max-width: 1200px; margin: 0 auto; }
@@ -23,90 +23,90 @@
 </head>
 <body>
 <div class="container">
-<h1>MovieLens Dataset User-Movie Preference Analysis Report</h1>
+<h1>MovieLens数据集用户-电影偏好分析报告</h1>
 <div class="summary">
-<h2>Data Overview</h2>
-<p>This analysis is based on the MovieLens dataset, containing 6040 users、3883 movies and 1000209 original rating records</p>
+<h2>数据概览</h2>
+<p>本分析基于MovieLens数据集包含 6040 位用户、3883 部电影 和 1000209 条原始评分记录</p>
 </div>
-<h2>User Profile Analysis</h2>
+<h2>用户基本情况分析</h2>
 <div class="figure">
 <img src="user_analysis/gender_distribution.png" alt="User Gender Distribution">
-<p class="caption">User Gender Distribution</p>
+<p class="caption">用户性别分布</p>
 </div>
 <div class="figure">
 <img src="user_analysis/age_distribution.png" alt="User Age Distribution">
-<p class="caption">User Age Distribution</p>
+<p class="caption">用户年龄分布</p>
 </div>
 <div class="figure">
 <img src="user_analysis/occupation_distribution.png" alt="User Occupation Distribution">
-<p class="caption">User Occupation Distribution</p>
+<p class="caption">用户职业分布</p>
 </div>
-<h2>Movie Distribution Analysis</h2>
+<h2>电影分布情况分析</h2>
 <div class="figure">
 <img src="movie_analysis/genre_distribution.png" alt="Movie Genre Distribution">
-<p class="caption">Movie Genre Distribution</p>
+<p class="caption">电影类型分布</p>
 </div>
 <div class="figure">
 <img src="movie_analysis/year_distribution.png" alt="Movie Release Year Distribution">
-<p class="caption">Movie Release Year Distribution</p>
+<p class="caption">电影发行年份分布</p>
 </div>
 <div class="figure">
 <img src="movie_analysis/most_rated_movies.png" alt="Top 20 Most Rated Movies">
-<p class="caption">Top 20 Most Rated Movies</p>
+<p class="caption">评分数量最多的20部电影</p>
 </div>
-<h2>Rating Distribution Analysis</h2>
+<h2>评分分布情况分析</h2>
 <div class="figure">
 <img src="rating_analysis/rating_distribution.png" alt="Rating Distribution">
-<p class="caption">Rating Distribution</p>
+<p class="caption">评分分布情况</p>
 </div>
 <div class="figure">
 <img src="rating_analysis/genre_avg_ratings.png" alt="Average Rating by Movie Genre">
-<p class="caption">Average Rating by Movie Genre</p>
+<p class="caption">各类型电影的平均评分</p>
 </div>
 <div class="figure">
 <img src="rating_analysis/top_rated_movies.png" alt="Top 20 Highest Rated Movies">
-<p class="caption">Top 20 Highest Rated Movies (min. 100 ratings)</p>
+<p class="caption">评分最高的20部电影至少有100个评分</p>
 </div>
-<h2>User Characteristics and Movie Preferences</h2>
+<h2>用户特征与电影偏好分析</h2>
 <div class="figure">
 <img src="preference_analysis/gender_genre_heatmap.png" alt="Movie Genre Preferences by Gender">
-<p class="caption">Movie Genre Preferences by Gender</p>
+<p class="caption">不同性别的电影类型偏好对比</p>
 </div>
 <div class="figure">
 <img src="preference_analysis/age_genre_heatmap.png" alt="Movie Genre Preferences by Age Group">
-<p class="caption">Movie Genre Preferences by Age Group</p>
+<p class="caption">不同年龄段的电影类型偏好对比</p>
 </div>
 <div class="figure">
 <img src="preference_analysis/age_year_heatmap.png" alt="Preferences for Movies by Decade Across Age Groups">
-<p class="caption">Preferences for Movies by Decade Across Age Groups</p>
+<p class="caption">不同年龄段对不同年代电影的偏好</p>
 </div>
 <div class="figure">
 <img src="preference_analysis/gender_avg_rating.png" alt="Average Rating by Gender">
-<p class="caption">Average Rating by Gender</p>
+<p class="caption">性别与平均评分的关系</p>
 </div>
-<h2>Conclusions and Insights</h2>
-<p>Through in-depth analysis of the MovieLens dataset, we found significant correlations between user characteristics (gender, age, occupation) and movie preferences. Key findings include:</p>
+<h2>结论与洞察</h2>
+<p>通过对MovieLens数据集的深入分析我们发现了用户特征如性别、年龄、职业与电影偏好之间存在显著关联。主要结论包括</p>
 <ul>
-<li>Significant differences in movie genre preferences between genders</li>
-<li>Age influences how users rate movies from different decades</li>
-<li>Occupational background correlates with genre preferences</li>
+<li>不同性别用户在电影类型偏好上存在明显差异</li>
+<li>年龄因素会影响用户对不同年代电影的评价</li>
+<li>职业背景与电影类型偏好具有相关性</li>
 </ul>
-<p>These findings provide valuable reference for designing movie recommendation systems and developing movie marketing strategies.</p>
+<p>这些发现对于电影推荐系统的设计和电影营销策略制定具有重要参考价值。</p>
 </div>
 </body>
 </html>