[Acfun] Fighting Flamers and Flooders with Regular Expressions

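As Acfun has grown more popular, all kinds of flaming and flooding danmaku (overlaid comments) have been taking over more and more of everyone's screen. The two main responses so far are flaming back, and repeatedly asking the monkey (the site's admin) to introduce something like real-name registration for comments (the monkey never responds). I think the first only degrades the experience of bystanders further, and these counter-flame comments are extremely hard to filter, so I suggest people not do it. The second would kill Acfun's traffic, so I can understand why the monkey ignores it.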

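Fortunately, the comment filter now supports regular expressions. Building the ac娘 tool a while ago gave me some understanding of regular expressions, but I only half understood the more advanced features, backreferences and assertions. This time, to fight the flamers and flooders, I carefully read articles introducing regular expressions again and again and finally more or less worked out what those two mean, so I could keep improving my comment-filtering regular expressions.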

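The guiding principle for fighting flamers and flooders: cut down the comments as much as possible, and never be afraid of false positives. Chinese is a vast and rich language, so killing off some innocent comments doesn't matter; we still have a whole forest of snarky comments left!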
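A minimal sketch of what backreference- and assertion-based filters could look like, written with Python's `re` module. The patterns, the keywords, and the `should_block` helper are hypothetical illustrations, not the actual filter list, and the site's filter may use a slightly different regex dialect.

```python
import re

# Hypothetical patterns for illustration only.
# Backreference: one character repeated 6+ times in a row, e.g. "23333333".
REPEATED_CHAR = re.compile(r"(.)\1{5,}")
# Backreference on a group: a 2-4 character chunk repeated 3+ times in a row.
REPEATED_CHUNK = re.compile(r"(.{2,4})\1{2,}")
# Lookahead assertions: both keywords appear somewhere, in any order.
BOTH_KEYWORDS = re.compile(r"^(?=.*猴子)(?=.*实名)")

PATTERNS = [REPEATED_CHAR, REPEATED_CHUNK, BOTH_KEYWORDS]


def should_block(comment):
    """Return True if any filter pattern matches the comment."""
    return any(p.search(comment) for p in PATTERNS)
```

In line with the "don't fear false positives" principle, the thresholds are deliberately aggressive; raising the repeat counts would let more borderline comments through.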

Staying Away from the Low Level

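I need to add a user favorites feature to actools. With my old development approach this would take about 2 to 3 days, but I am in no hurry to build it, and I have been affected by the following: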

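  • bilibili.us launched with a bang, with all its licenses in order and an RSS feed to boot, while the current state of Acfun's management is worrying, so I'm reluctant to invest effort in it casually;
  • my local development and debugging environment, including Eclipse PHP and LAMP, was only set up recently; before that I couldn't develop incrementally, which sapped my motivation;
  • while thinking about OO I kept running into doubts, became dissatisfied with the existing code, and spent a lot of time rewriting it;
  • I discovered some odd PHP syntax (such as @ and &), got discouraged, and stalled;
  • there is no pressure at all: I work on it when I feel like it, and play when I don't.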

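So I kept putting it off. As a web programmer who is about as amateur as they come, I try to improve by studying other people's good designs. I know full well that reading a few crash courses won't turn me into one of those "experts" who constantly look down on others and get looked down on in turn, and I don't plan to become that kind of person anyway, arguing with others about efficiency or philosophy. Meeting my own standards is enough. But my current design approach is something even I can't stand to look at, which is what motivates me to study and think.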

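Where things stand now: I don't dare start writing from the low-level pieces, and instead plan to piece together other people's code inside a sufficiently open, lightweight framework. Piecing together is more than plain copy and paste; I'm confident I can extend other people's code so that it fits the existing framework. Overall, though, apart from unique features such as fetch, I want to rely on other people's code as much as possible for ordinary work such as user management and general database operations. For database queries, for example, the only wrapper I can come up with is at the level of "pass in SQL, get back the query result"; conceptually I would struggle to reach the level of thinking behind Active Record.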
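To make the contrast concrete, here is a rough sketch of the two levels of wrapping, using Python's built-in `sqlite3` for brevity. The `videos` table, the `run_query` helper, and the `Video` class are hypothetical names invented for this illustration; this is not how any particular framework implements Active Record.

```python
import sqlite3


def run_query(conn, sql, params=()):
    """The 'pass in SQL, get back the query result' level of wrapping."""
    return conn.execute(sql, params).fetchall()


class Video:
    """Tiny Active Record-style class: one object is one row and can save itself."""

    def __init__(self, title, vid=None):
        self.id = vid
        self.title = title

    def save(self, conn):
        if self.id is None:
            # New object: insert a row and remember the generated primary key.
            cur = conn.execute("INSERT INTO videos (title) VALUES (?)", (self.title,))
            self.id = cur.lastrowid
        else:
            # Existing object: write its current state back to its row.
            conn.execute("UPDATE videos SET title = ? WHERE id = ?", (self.title, self.id))
        return self

    @classmethod
    def find(cls, conn, vid):
        row = conn.execute("SELECT id, title FROM videos WHERE id = ?", (vid,)).fetchone()
        return cls(row[1], row[0]) if row else None
```

With the first style the caller writes every SQL statement; with the second, the calling code only deals with objects such as `Video("some title").save(conn)`.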

So: stay away from the low level.

War of Internet Addiction (《网瘾战争》): I Feel Obliged to Spread It

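I don't play WOW, but that doesn't stop me from thinking this is a fine piece of work. After watching the whole film, my most immediate feeling is that passionate words move people more than the sarcasm that circulates widely only on microblogs. I especially like the part where everyone stays silent but passes on strength through the tolling of bells, because that way the film can't be dismissed by people with ulterior motives as "a struggle led by a backward class, a fantasy of the peasant class", which brings it closer to real-world demands. Well-known creators from Ac also took part, and, best of all, the great 荼荼丸 lends a voice to it.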

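You seem to need some background in WOW, Acfun, and the non-mainstream current events of 2009 to follow it; even at my level I basically understood it.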

Original source on NGA

Acfun version: the short segment voiced by 荼荼丸 was cut, so I recommend downloading the HD version.

HD version: 纳米盘 | 电驴 (eDonkey)

The Complete Guide to 《网瘾战争》 by @duck_1984

Soundtrack

ac娘 Tool: Moving the Database from SQLite to MySQL

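After thinking it over: rather than downloading the SQLite database, finding some odd conversion tool to turn it into MySQL, and uploading it again, I might as well do the whole thing directly on the remote server, as long as I take care with backups.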

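So I quickly wrote a Python script to do it; with a few references at hand it was done in no time. Since I had hardly touched MySQL before, many settings such as the storage engine and the character encoding were wrong at first. Only after getting garbled text, and then finding that statements couldn't be run as transactions, did I learn that I should choose the InnoDB engine and set everything to utf8.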
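A minimal sketch of such a migration step, under stated assumptions: the `videos` table layout and the `copy_videos` helper are hypothetical (the real schema is not shown here), and `mysql_conn` is assumed to be a DB-API connection from a MySQL driver such as PyMySQL, opened with `charset="utf8"`.

```python
import sqlite3

# Hypothetical table layout; the real schema is not shown in the post.
MYSQL_DDL = """
CREATE TABLE IF NOT EXISTS videos (
    id INT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
"""


def copy_videos(sqlite_conn, mysql_conn):
    """Copy every row from SQLite into MySQL, committing the inserts as one unit.

    mysql_conn is assumed to be a DB-API connection from a MySQL driver
    that uses %s placeholders (PyMySQL does).
    """
    rows = sqlite_conn.execute("SELECT id, title FROM videos").fetchall()
    cur = mysql_conn.cursor()
    try:
        cur.execute(MYSQL_DDL)  # table creation is not part of the transaction
        cur.executemany("INSERT INTO videos (id, title) VALUES (%s, %s)", rows)
        mysql_conn.commit()     # with InnoDB the inserts succeed or fail together
    except Exception:
        mysql_conn.rollback()
        raise
    return len(rows)
```

Declaring the engine and character set in the table definition itself, rather than relying on server defaults, is one way to avoid both the garbled text and the missing transaction support described above.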

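MySQL really is so much better than SQLite. Being used to Access, I had always assumed that databases like MySQL or MSSQL, which can't be moved around as a single file, were very inconvenient. Only now do I realize that with something as powerful as phpMyAdmin around, MySQL is far easier to use than SQLite, which needs a Python script for decent access plus a display interface on top, or else shows nothing but a pile of hard-to-read arrays. With cPanel available, backups are also more convenient than with SQLite. Most importantly, SQLite comes in versions 2 and 3; Python uses 3 while PHP's CI (CodeIgniter) uses 2, so porting is quite a hassle. Sigh.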

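Also, I'm once again wondering whether to build a PHP version of the ac娘 tool at all. The program written in Python feels much faster, especially after the switch to MySQL. This may have to do with how Python scripts are executed; after all, a .py file needs 755 permissions to run, whereas PHP only needs 644 (although the permission bits themselves only reflect how the scripts are launched, not how fast they run). For an auxiliary tool for someone else's website, being usable is good enough, really.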

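During the port I redesigned the data structure to take full advantage of MySQL's special data types: the category field uses enum instead of smallint, and date and time use the standard date and time formats. So besides changing the database interface and table names, the original Python program needed changes in many other places. Overall the program got a bit more optimized, and there is still room for more. I also changed the cron job that fetches data automatically so that it reuses my own libraries via sys.path.append. (Previously I copied the code of several libraries into a single file: that goes beyond procedural programming and reaches assembly-language-style programming. Trivia: cron is the scheduling daemon in Linux; one of its quirks is that it starts programs from the root directory rather than from the script's own directory, so you can't simply import other libraries that sit in the same directory as the program.)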
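The sys.path.append approach can be sketched roughly as follows, assuming the cron job is started from a working directory other than the script's own. The helper name `add_script_dir_to_path` and the module name `mylib` are made up for illustration.

```python
import os
import sys


def add_script_dir_to_path(script_file):
    """Append the directory containing script_file to sys.path (only once)."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    if script_dir not in sys.path:
        sys.path.append(script_dir)
    return script_dir


# At the top of the cron-driven script one would call:
#     add_script_dir_to_path(__file__)
#     import mylib   # hypothetical module stored next to the script
```

Resolving the directory from the script's own path, instead of hard-coding it, keeps the script working if the whole folder is moved to another host.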

Having tinkered this far, I feel a real sense of accomplishment.

Progress on the PHP Version of the ac娘 Tool

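The user-facing pages (search page, redirect page) are done. Once the most critical piece, the update module, is finished and the data from the old SQLite database is imported, it can be released for people to use.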

Problems encountered:

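  1. The libraries and helper functions of the CI framework are fairly comprehensive, and it is working hard toward becoming ever more general-purpose, yet I still had to break some of its basic assumptions. For example, the search feature ended up using the GET method, which CI discourages. Before deciding, I looked at many large applications such as Douban, Google, and Baidu; they all use GET for search, so this seems unavoidable.
  2. Where is the best place to load libraries and helper functions? I currently load them only when needed, which reduces each module's dependence on the others but makes the whole program rather ugly, with bits scattered here and there.
  3. I haven't thought enough about how MVC separates the model from the view; data is queried and put straight into the view.
  4. Currently the entire content of each view is output with echo. But should a view be written as HTML with embedded PHP, or is all-echo better? HTML templates are more readable, but the PHP program should be a bit more efficient; and if I learn some template language, features like pagination may become a new problem.
  5. Compared with the original Python program, the PHP version is much slower, and I don't yet know where the bottleneck is.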

Acfun Approved by Daiwan Crowd

It’s controversial so I won’t translate it into Chinese.

“Daiwan” is a well-known joke name. I learned it from Uncyclopedia, which is fun but blocked. ╮(╯_╰)╭

Well, I’m sometimes idle enough to poke around Wikipedia. I do care whether Acfun is accepted there, because an Acfun article once appeared on Wikipedia about a year ago but was deleted immediately on the grounds that it was a company profile, an advertisement, or a website entry. That was understandable, but it hurt. I did not fight it, because I was not familiar with Wikipedia and did not want to confront the Daiwan Crowd, whom I respected and who had perhaps occupied Wikipedia on behalf of their mainland compatriots. They did not know how important Acfun was in the mono-cultural mainland; all they knew about Acfun was that it was a knockoff of Nicovideo.

Now I think it has been approved. When I visited the Acfun page just now, all I found at the top of the article was “more references needed to prove notability”, rather than any notice that it was deleted, up for deletion, or disputed for some reason. I’m glad it was approved.

Maybe the Wikipedia system should, and eventually does, reveal the value of Acfun, but I simply think the popular JINKELA derivative works helped directly. When I searched for JINKELA on blogs written in Traditional Chinese, I found many people who were struck by the original ads and thought highly of the derivative works made by 221, hank, or other shokunin. We understand each other when we satirize.

As mainland IP addresses are still blocked by tw.nico, and the mainstream media on both sides are still smearing each other, we have a looooooooooooong way to go before we fully understand each other. I’m trying my best to comprehend; for instance, I follow Lucifer Chu. But I’m not yet ready for opposing voices from the other side, what a shame. The mono-culture kills.

Python Helps

In my feed updates, I saw that someone had grabbed all of VeryCD’s data and set up a knockoff hosted abroad. He spent a week on it, using Python for the crawler and the website, cron jobs (maybe to fetch data regularly), and so forth. ~\(≧▽≦)/~

I’m doing the same for Acfun! But what he did is much greater: I only indexed Acfun, while he duplicated VeryCD. What matters is that we both used Python. urllib2 is a good thing.

Python is excellently designed, as I found when I dived into it a little. For someone like me, whose coding foundation is weak, developing a well-structured small program is not easy, but I’ll keep at it because I expect it to pay off.

Developing a Search Tool for Acfun

When the monkey took away Acfun’s built-in search function, I started planning this tool. After two hours of thinking and pacing around my room, I felt I knew what to do and started building it.

I spent 3 days getting to the current semi-finished Acfun Search Tool, using Python, a language I had never used before. I was free to pick any scripting language, since I knew nothing about any technology newer than Java anyway. In the end I chose Python, because:

  1. It worked right after installation, with no further configuration.
  2. A script runs with a double-click, like VBScript, rather than by typing “python yourname.py<enter>” on the command line. That matters to me because it is simple enough, haha.
  3. The practical reason: Google App Engine supports it. I originally meant to host the tool there to benefit from the mighty Google, but I soon realized that GAE only supports data reads and writes of up to 1M, so I moved to other free hosting.

At the very start, I planned to fetch all the data from acfun.cn first, keep it up to date, and build an RSS feed from it so that others could build on the data, but I decided to finish the search function first. I spent 2 days grabbing data from Acfun, modifying my crawler several times to cope with irregular but frequently encountered data, and changing the storage format from CSV files to an SQLite database (CSV had been chosen with GAE in mind). When Acfun became busy at prime time, I had to pause the crawler.
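A minimal sketch of the CSV-to-SQLite step with a keyword search on top, assuming each CSV row holds an id, a title, and a URL. The column layout, the `videos` table, and both function names are assumptions made for this example, not the tool's real schema.

```python
import csv
import io
import sqlite3


def csv_to_sqlite(csv_file, conn):
    """Load rows of (id, title, url) from an open CSV file into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, url TEXT NOT NULL)"
    )
    # Skip irregular rows that do not have exactly three fields.
    rows = [(int(r[0]), r[1], r[2]) for r in csv.reader(csv_file) if len(r) == 3]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT OR REPLACE INTO videos VALUES (?, ?, ?)", rows)
    return len(rows)


def search(conn, keyword):
    """Return (id, title) pairs whose title contains the keyword."""
    sql = "SELECT id, title FROM videos WHERE title LIKE ? ORDER BY id"
    return conn.execute(sql, ("%" + keyword + "%",)).fetchall()


# Tiny in-memory demonstration with made-up data.
sample = io.StringIO("1,东方MAD,http://example.com/v/1\n2,金坷垃,http://example.com/v/2\n")
demo_conn = sqlite3.connect(":memory:")
loaded = csv_to_sqlite(sample, demo_conn)
```

Using `INSERT OR REPLACE` keyed on the id means the same loader can be rerun on fresh crawl output without creating duplicate rows.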

Finally the crawler accomplished its mission and was converted into a data updater. I was aware of Python web frameworks such as Django, but I decided to use plain Python CGI scripts, because that is much easier for someone like me who knows nothing about frameworks, MVC, or MVT. Because I had to test my scripts directly on the remote host, my workload grew geometrically. It works now, although with hardly any user interface... usable is good enough.
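For readers who have not seen one, a plain CGI search script can be sketched like this, using only the standard library. The query parameter name `q` and the function `build_response` are hypothetical; this is an illustration of the technique, not the tool's actual code.

```python
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Return a complete CGI response (headers, blank line, body) for a search."""
    params = parse_qs(query_string)
    keyword = params.get("q", [""])[0]  # "q" is a made-up parameter name
    # Escape user input before echoing it back into HTML.
    body = "<p>Searching for: %s</p>" % html.escape(keyword)
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # The web server passes the part of the URL after "?" in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

Keeping the response building in a function that takes the query string as an argument makes it possible to try the logic locally before uploading, which reduces the amount of debugging done on the remote host.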

I’ve announced it on the Acfun Tieba, hoping it helps people and eases some of the load on the Acfun server. Now I have to get back to revising. The tool is available at http://illustrate.heliohost.org/ac.htm.

What Is Acfun

Important notice: Before reading this article, make sure you CAN ignore its misused or unsuitable words and grammar without puking.

I want to say “Change!” like someone who has become famous, but not for fame’s sake. In fact, having realized that this blog is deserted, I’ve decided to change it, mostly its content. I also propose to change its main language to English (I noticed that Google Toolbar automatically corrected “in” to “into”, that’s great), because I have to practice my poor English, especially my writing, for the GRE (the Chinese one, of course). Another reason is that I’m planning to write some comments or reviews about ACFUN, which is controversial. I don’t want to argue about ACFUN with anyone; I just want to discover its value: how it gives us joy.

ACFUN is a privately run video sharing website where comments are overlaid directly onto the video, synced to a specific playback time. ACFUN used to be considered a “Shanzhai” (knockoff) of Nicovideo, but it has developed its own unique culture, although its development is inevitably influenced by Nicovideo.

It is worth mentioning that ACFUN probably stands for “Anime, Comic Fun”, but it now hosts videos related not only to anime, comics, and other subjects popular on Nicovideo such as Touhou, doujin works, Vocaloid, MADs, and Kichiku (鬼畜), but also to popular or funny incidents in China, such as lv8niang and geping, the character voice of Lan Mao (Blue Cat), and so on. Unlike on ordinary video websites, videos are produced not just for delivery; I would even say, not for delivery at all. The original material is reworked, usually distorted: for example, a short sound clip is repeated more than 100 times for a brainwashing effect (a technique known as Kichiku). Once these crazy works are published, viewers will surely get excited and add mad comments that are overlaid directly onto the videos, so latecomers are stimulated both by the videos themselves and by the comments rolling across the screen.

Those who produce these videos are called “meisters”, a term that undoubtedly comes from Nicovideo. There are several kinds of meisters: the general ones who produce original videos using professional media software; meisters who “carry” wonderful videos over from Nicovideo or other video websites; and those who voluntarily provide translation (they have become a characteristic feature of ACFUN and are usually considered great, because they enable people who know little Japanese to enjoy a-c-fun from Japan). There are also content creators or deliverers who provide substantive content to the website, for example singers, painters, or game players, but considering that it is videos, not any other medium, that finally present the content, meisters are always the most respected, let alone meisters who create the content themselves.

Here are some famous meisters: tutuwan (two-two-one, or 221, 荼荼丸), beijizhu, Hank, Nini, and so on. They have produced countless works, such as “For the Holy Harmony”, the “Chief’s anger” series, and the “Jin Ke La” series. Many meisters have joined together and formed groups, such as “Carriers Office #⑨”, which enjoys “acfunwide” prestige.

ACFUN has become fairly popular in the ACG field, but with its popularity have come complaints about meaningless and offensive comments, and controversies about the direction of its development. Anyway, maybe it is better for ACFUN to have diversity, even though conflicts and attacks are inevitable. Besides, it would be too much for us to pin our hopes on a single website under the threat of overload and potential “Harmony”. It will keep giving us joy, and that’s enough.