钟喜佳

总结笔记:用Python+Web前端做个dashboard

不知不觉,在b站(哔哩哔哩)做了一段时间的萌新up主了。(虽然是萌新,但是还是有颗想当大V的心的)古人云:工欲善其事,必先利其器。“器”在这里我定义为数据分析,用数据驱动产品。

哔哩哔哩的创作中心数据趋势只展示七天以内的数据。于是萌生出写一个dashboard,把哔哩哔哩和YouTube的数据都放在mysql,用dashboard展示,还能从mysql拉历史数据,从不同纬度进行数据分析。

最下边有完整源码的github地址

第一个版本的demo长这样:


开始写dashboard

图表的生成用到了pyecharts的库,官方提供了个生成html的render()方法和生成多图的page()方法,但是只能固定排列样式。temp传入后,要多图也只能for循环,不能固定样式。

于是,到pyecharts源码,添加了个render_html_content()方法,作用是,生成一个包含图表html代码的对象。然后就可以和python3,format()方法结合生成index.html

注:web前端用了开源的keen/dashboards

format()技巧:

例如我们在html_temp.html写个{all},读取html_temp内容作为obj对象


然后obj.format(all=all),右边的all是html内容,执行后,就可以把{all}替换为html内容。

pyecharts源码修改:

(如何找到pip3安装的第三方库的地址?)

运行python3,

import pyecharts

pyecharts

就有地址了


修改render/engine.py


并在render类添加个render_html_content()函数

def render_html_content(self, template_name: str, chart: Any, path: str, **kwargs): tpl = self.env.get_template(template_name) html = utils.replace_placeholder( tpl.render(chart=self.generate_js_link(chart), **kwargs) ) return html

修改charts/base.py

image

并在base类添加个render_html_content()函数

def render_html_content( self, path: str = "render.html", template_name: str = "simple_chart.html", env: Optional[Environment] = None, **kwargs, ) -> str: self._prepare_render() return engine.render_html_content(self, path, template_name, env, **kwargs) 

在render/templates添加个temp.html

{% import 'macro' as macro %} {{ macro.render_chart_content(chart) }} 

这个temp.html 去掉了<html></html>等标签,只输出图表的html代码

这时只需要调用

图表对象.render_html_content(template_name="temp.html")

html_obj.format(all=all)

就可以把图表html替换掉模版html的{all}

获取哔哩哔哩数据:

光有图表没数据可不行

以下几个公开的API可以获取播放量、粉丝数、点赞等数据

https://api.bilibili.com/x/relation/stat?vmid=哔哩哔哩id

https://api.bilibili.com/x/space/upstat?mid=哔哩哔哩id

http://api.bilibili.com/x/space/navnum?mid=哔哩哔哩id

我们可以先建一个bilibili表,然后把数据插入进去

表结构为:

CREATE TABLE bilibili ( id int(8) unsigned NOT NULL AUTO_INCREMENT, view int(9) NOT NULL COMMENT '播放总数', follower int(9) NOT NULL COMMENT '被关注数', likes int(9) NOT NULL COMMENT '点赞数', video_count int(9) NOT NULL COMMENT '视频数', PRIMARY KEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; 

获取数据的python脚本:

#!/usr/bin/env python # -*- coding: utf-8 -*- import os,requests,json,pymysql class spider(object): """docstring for zs_spider""" def __init__(self): # create connection object self.conn = pymysql.connect(host='192.168.28.140',port=3306,user='test',passwd='test123',db='test',charset='utf8') self.cursor = self.conn.cursor() self.headers = { "user-agent": "Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp)", "referer":"https://space.bilibili.com/164106011/video", } self.vmid = "164106011" def __del__(self): # close connection object self.cursor.close() self.conn.close() def insert_testdata(self): sql = """select count(*) from bilibili;""" self.cursor.execute(sql) countNum = self.cursor.fetchall()[0][0] if countNum <= 5: for i in range(5 - countNum): self.insert_to_database(1000*i,10*i,10*i,1*i) self.conn.commit() print("已插入测试数据") def insert_to_database(self,view,follower,likes,video_count): #  sql = """INSERT INTO bilibili (view,follower,likes,video_count) VALUES ( %d, %d,%d, %d) """ data = (view,follower,likes,video_count) self.cursor.execute(sql % data) print("已插入今日数据") def select_data(self): sql = """select * from bilibili order by id DESC limit 6;""" self.cursor.execute(sql) return self.cursor.fetchall() def spider_get_data(self): follower = json.loads(requests.get("https://api.bilibili.com/x/relation/stat?vmid="+self.vmid,headers=self.headers).text)["data"]["follower"] upstat = json.loads(requests.get("https://api.bilibili.com/x/space/upstat?mid="+self.vmid,headers=self.headers).text)["data"] view = upstat["archive"]["view"] likes = upstat["likes"] video_count = json.loads(requests.get("http://api.bilibili.com/x/space/navnum?mid="+self.vmid,headers=self.headers).text)["data"]["video"] self.insert_to_database(view,follower,likes,video_count) self.conn.commit() def main(): bilibili = spider() # bilibili.spider() if __name__ == '__main__': main() 

Python获取近五天日期的列表:

import datetime def get_date(): date = list() for i in range(5): date.append((datetime.date.today() + datetime.timedelta(days = -i)).strftime("%m月%d日")) return date
date = get_date()[::-1] # 获取五天的日期 

最后附上生成dashboard脚本:

#!/usr/bin/env python # -*- coding: utf-8 -*- from pyecharts.faker import Faker from pyecharts import options as opts from pyecharts.charts import Pie,Page,Line from pyecharts.globals import ThemeType import get_data import datetime with open("index_temp.html","r") as f: f.readline().rstrip("\n      bg") index_content = f.read() f.close() def line_center(width,height,title,date,view): c = ( Line(init_opts=opts.InitOpts(theme=ThemeType.CHALK,width=width,height=height)) .add_xaxis(date) .add_yaxis("哔哩哔哩", view) # .add_yaxis("YouTube", [3,2,55,4,5]) .set_series_opts( areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=True), ) .set_global_opts( xaxis_opts=opts.AxisOpts( axistick_opts=opts.AxisTickOpts(is_align_with_label=True), is_scale=False, boundary_gap=False, ), ) ) return c def line_left(width,height,title,date,data): c = ( Line(init_opts=opts.InitOpts(theme=ThemeType.CHALK,width=width,height=height)) .add_xaxis(date) .add_yaxis(title, data) .set_series_opts( areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=True), ) .set_global_opts( yaxis_opts=opts.AxisOpts(name="单位:/千人", axislabel_opts=opts.LabelOpts(formatter="{value} K"), ), ) ) return c def line_right(width,height,title,date,data): c = ( Line(init_opts=opts.InitOpts(theme=ThemeType.CHALK,width=width,height=height)) .add_xaxis(date) .add_yaxis(title, data) .set_series_opts( areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=True), ) ) return c def bottom_all(width,height,title,date,view,follower,likes,video_count): c = ( Line(init_opts=opts.InitOpts(theme=ThemeType.CHALK,width=width,height=height)) .add_xaxis(date) .add_yaxis( series_name="被关注数", stack="总量", y_axis=follower, areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=False), ) .add_yaxis( series_name="点赞数", stack="总量", y_axis=likes, areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=False), ) .add_yaxis( series_name="视频总数", stack="总量", y_axis=video_count, areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=False), ) .add_yaxis( series_name="播放总数", stack="总量", y_axis=view, areastyle_opts=opts.AreaStyleOpts(opacity=0.5), label_opts=opts.LabelOpts(is_show=False), ) .set_global_opts( tooltip_opts=opts.TooltipOpts(trigger="axis", axis_pointer_type="cross"), yaxis_opts=opts.AxisOpts( type_="value", axistick_opts=opts.AxisTickOpts(is_show=True), splitline_opts=opts.SplitLineOpts(is_show=True), ), xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False), ) ) return c def get_date(): date = list() for i in range(5): date.append((datetime.date.today() + datetime.timedelta(days = -i)).strftime("%m月%d日")) return date def write_html_to_file(format_content): with open("index.html","w+") as f: f.write(format_content) f.close def main(): get_data.spider().insert_testdata() #如果数据不存在,插入前五天的测试数据 date = get_date()[::-1] # 获取五天的日期 # get_data.spider().spider_get_data() data = get_data.spider().select_data()[::-1] # 爬取哔哩哔哩 用户数据 view = [x[1] for x in data[1:]] # 从用户数据提取 播放数 follower = [x[2] for x in data[1:]] # 从用户数据提取 关注数 likes =[x[3] for x in data[1:]] # 从用户数据提取 点赞数 video_count =[x[4] for x in data[1:]] # 从用户数据提取 视频播放数 view_six_day = [x[1] for x in data] view_sub = [(view_six_day[x+1]-view_six_day[x])/1000 for x in range(len(view_six_day)-1)] follower_six_day = [x[2] for x in data] follower_sub = [follower_six_day[x+1]-follower_six_day[x] for x in range(len(follower_six_day)-1)] # 开始画图并生成html # "256px","325px" all = line_center("533px","325px","总曝光量",date,view).render_html_content(template_name="temp.html") line_left_bilibili = line_left("310px","325px","新增播放",date,view_sub).render_html_content(template_name="temp.html") line_right_bilibili = line_right("310px","325px","新增关注",date,follower_sub).render_html_content(template_name="temp.html") bottom = bottom_all("1226px","600px","新增数",date,view,follower,likes,video_count).render_html_content(template_name="temp.html") format_content = index_content.format(all=all,line_left_bilibili=line_left_bilibili,line_right_bilibili=line_right_bilibili,bottom_all=bottom) print(all) write_html_to_file(format_content) print("index.html生成成功") if __name__ == '__main__': main() 

完整源码的github地址:

https://github.com/guyuxiu/project

参考文献:

  1. 【pyechart文档】https://pyecharts.org/

  2. 【dashboards源码】https://github.com/keen/dashboards

  3. 【哔哩哔哩 API】https://github.com/SocialSisterYi/bilibili-API-collect/








码字很辛苦,转载请注明来自.钟喜佳笔记 .《总结笔记:用Python+Web前端做个dashboard 》