清心站长部落空间: General

Saturday, August 22, 2009

Web 3.0 Guess: Natural Language Processing

The web has now reached Web 2.0. This is the version of the web that most of us know and use today.

Web 1.0: The web is driven by data and information provided by the webmaster. This is simplex (one-way) transmission.

Web 2.0: The web is driven by data and information provided by both webmasters and users, as well as by their interaction and communication. This is full-duplex (two-way) transmission.

In less than a decade, many people have started talking about Web 3.0 and what it will look like.

Emerging technologies will drive Web 3.0, and user-centricity is one of its core concepts. Whereas Web 2.0 is a data-driven world, Web 3.0 will be a user-driven one. Every web user will have a unique identity with which to identify themselves to others. Everything becomes easier, and everything becomes possible.

From the user's perspective: users should be able to use whatever service accomplishes their task, without needing to know where the service is or how it produces its solution. Users should be able to search from fragments of memory, without knowing the object's real name or even what it looks like. And users should get direct suggestions or guidance from the web, rather than yet another series of blue links with black description text.

From the web's perspective: the web may become intelligent. It can learn as humans do, knowing things based on experience. It can store that knowledge, talk to people, provide knowledge to them, and accomplish their tasks. It can run on automated mechanisms without a human driver behind it. It makes our world easier and more delightful. It may act like an expert, offering expert suggestions or guidance that lead users to what they want.

When the web becomes clever, it can interpret the user's language, work out what the user really wants, and find a solution that helps the user accomplish the task. There is an area of AI that investigates this problem, called natural language processing. It will be an important concept to study: finding the regular patterns by which natural language varies, and how an ordinary person's brain interprets natural language and turns it into knowledge.

Think of a newborn baby, who does not know what the world is or what language is. All of us grew up from babies, from knowing nothing to knowing a great deal. We gather our first information from the language spoken by our mother and father, one word at a time. How does our brain store each word in its tiny neurons? How do we retrieve the stored words to form meaningful sentences when we try to interpret other information, find relevant and similar information, understand it, and respond? How, in fact, do we form the structure of our sentences? As we gain experience, everything seems easier. But what if we have an accident and lose some memory? Then the cache of experience is gone: we fail to interpret something we are supposed to know, or we cannot remember it at all. Computer scientists have already begun investigating these problems in natural language processing. Think about it. And then ask: why does the application of natural language processing matter to Web 3.0?

When the web can interpret our natural language, everything really does become easier. Many friends have asked me how to install particular Discuz! plugins (Discuz! is forum software developed by a Chinese company). I tell them to Google a tutorial, but they in turn ask me what keywords to type. For now this is annoying. But suppose we could simply type a natural sentence describing what we don't know into the search engine's text box, and the search engine returned a specific suggestion rather than a long list of links to wade through again. For example, a user could write: "I want to know how to install Discuz! Bank Plugins".

We also know about expert systems, which capture human expertise in a knowledge base and use it to solve human problems. But an expert system can only be specialized to one narrow domain. In my view, the investigation of Web 3.0 has already started with the question of how to build the semantic web to categorize information and knowledge across the web, which in turn helps natural language processing and interpretation. I think someone could build a one-stop intelligent search engine that acts like a general expert system, one that consults narrower expert systems to find what the user really wants. By the same logic, no one can rule the world alone; it takes cooperation to deliver something good for people in the days to come.

Monday, December 29, 2008

Added Qxinnet Search to My Blog

I have just implemented Qxinnet Search (http://search.qxinnet.com) in my blog (http://fyhao.qxinnet.com).

To start, you can open the search pages (web, video, image) from the top right, or use the blog search.

After you search for a keyword, you will be redirected to a page like this one.

For example, when you search for "Qxinnet", you will find this on the page.

You can directly click the links below "Search Results for", for example the web pages, videos, or images for Qxinnet.

It is very convenient, I think, when you want to find more about something.

Thanks for your support.

Fyhao New Blog Site

I have moved my WordPress blog from http://fyhao.wordpress.com to http://fyhao.qxinnet.com. You are all welcome to visit and support it.

That blog will mainly carry news from me and from Qxinnet, but it will also cover some information technology news and information about my studies. I will also post my experience and tutorials on how to win a scholarship, because I had a hard time getting one myself. I hope others can avoid some wrong turns when applying for a scholarship or looking for study information, for example about going to university or college in Malaysia.

I think I will write about my feelings and views in that blog later on. I will mainly write in English, but also in Chinese.

As for this blog, http://fyhao.blogspot.com, I have decided to keep it going, because I have grown attached to it.

Leave me a comment if you have any suggestions or views on my work. I will try my best to make it better than ever.

Thank you very much.

Saturday, December 27, 2008

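Qxinnet Image Search

I have added an image search feature. Modeled on FaceSearch, it uses Protoflow to create a dazzling search effect: the images are presented in a dynamic Flash style that feels great to watch. I also spent some time studying the Google Image Search API in order to solve the problem of how many image results it returns.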

http://search.qxinnet.com/image.html

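Google's Free APIs

This week I found time to look at the free APIs that some websites offer developers, which seems to be a popular practice lately. It makes it easy for others to build on your resources, and at the same time gets you countless stable free advertisements, traffic, and ideas without spending a cent. The precondition, of course, is that your data is attractive enough to be worth using. After trying several APIs that interested me, I found that each has its strengths, but on the whole most are still far from perfect in functionality. That makes sense: since it is a free lunch, stopping at just enough and leaving people wanting more is the smartest approach.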
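Web search:
In terms of features, the strongest is without question Google's Co-op custom search engine, although it does not really count as an API: all the user does is configure it, and very little is actually open to development. Put simply, Co-op gives you a search engine whose scope and Onebox you define yourself, served from Google's page or embedded in your own page through an iframe. Precisely because it is embedded via iframe, it is quite limited: you cannot customize the CSS (the default settings only let you change colors) or control the position and display of modules, and there are some unavoidable small problems in use, for example the keyword in the search box does not change after you click a related search. The navigation bar on the right of this site has a site search box that uses Co-op, which you can try. For site search like this, the feature set is enough. True customization is not impossible either, as with Tencent's soso, provided you are a Google partner...
Google's AJAX Search API complements Co-op to some extent and is embedded directly in the page. Although it offers only two interfaces, in theory you can fetch all the GResult objects and render them yourself; and since it is based on JavaScript, with enough time and patience to read through the source you can modify even more. Its fatal weakness, however, is that it returns at most 8 results per search and cannot page through them, so it cannot take Co-op's place. On the other hand, the Web Search API can use the site list defined in your own Co-op, so combining the two might produce something interesting.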
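Photos:
Google's AJAX Search API supports images. Its pros and cons are as described above, plus it does not support Co-op, so it is not very practical.
Flickr's API is extremely powerful: if you wanted to, you could use it to build a site with almost the same features as Flickr itself, using Flickr's resources of course. However, since Flickr has been "harmonized" (blocked), users inside the firewall cannot enjoy these features (in fact everything works except that the images do not display). Flickr API calls return results in XML, so you are free to do whatever you like with them. For photo search, which I care about most, Flickr supports AND and OR across multiple keywords and several sort orders; in short, it is very powerful. One flaw is that when I want to embed some landscape photos in a page, the most relevant results all have people in them. Google's image search already supports face recognition and can show only images with faces; if Flickr could do the opposite and let you show only photos without faces, that would be wonderful.
The Chinese Flickr imitators Yupoo and bababian also offer some Flickr-like APIs, but very few. And for keyword search they sort only by time, not by relevance, perhaps mostly because the technology is not there yet.
Picasaweb released a set of APIs a while ago. I have not studied them closely; they seem mainly intended for users to manage their own photos, and at least do not support search. Given Google's style, I believe Picasa will eventually have an API that is no less capable than Flickr's.
Panoramio's API lets you fetch the photos within a geographic area and returns a list in JSON. On its own it is of limited use, but combined with other APIs such as a map API it opens up endless possibilities.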
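Maps:
Without question it is Google again, with the Maps API. It already has countless applications, so I won't say much. I have long been optimistic about organizing data and sharing content by combining time and space, but for now the potential here is far from realized, and Panoramio, the only one that looked halfway decent, has already been acquired by Google; in fact it was only a small first taste of this kind of application. Unfortunately, given the Chinese government's inexplicable sensitivity about map information, doing such applications well in China is harder still. Back to the API: the Maps API itself does not seem to support search, otherwise cloning Google Maps with it would be trivial.
In China, 我要地图网 also provides an API, including markers and search. It is not as powerful as Google's, but the barrier to entry is correspondingly much lower.
Google's AJAX Search API includes a Local Search API. On its own it is not very usable, but perhaps it can be integrated with the Maps API. It does not support data for China.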
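Blog search:
Once again, Google's AJAX Search API, which comes with a dedicated Blog Bar. I have put one on my page for reference.
Technorati offers a keyword search API that returns XML. It looks lovely, but it handles relevance for Chinese searches very poorly, and the quality is low. Also, strangely, its API is limited to 500 calls per day; I really cannot think of an application you could build on 500 calls...
It is worth noting that both Technorati and Google Blogsearch support RSS subscriptions to search results. If your traffic is not too heavy, you can make do by scraping those, though it is not very sporting.
As you can see, the big foreign players all pay attention to their relationship with developers and offer free APIs, a win for both sides. In China it is basically only small Web 2.0 companies doing this kind of worthwhile thing (including some simple embeddable JavaScript implementations), while the big companies look for ways to annoy users, such as blocking hotlinked images. "Those in power are shortsighted and cannot plan far ahead": truly a saying that has echoed for a thousand years.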

Facesaerch: A Face Search Engine Built on the Google API

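In April, Chada introduced CreativSpace, a visual image search engine. Two days ago I found a comment in the admin panel from its author, Franz Enzenhofer, who has written another distinctive search engine: Facesaerch, an engine dedicated to searching for faces.

[Screenshot: facesaerch]

Facesaerch and CreativSpace are both built on the Google API, and the images they find all come from Google. One black and one white, they look like brothers. The technology is the same too: both use the Photoflow script to display search results dynamically, which feels great. Heh, I mean the feeling of photos being shown one by one when you use it to search for beautiful women. Of course, to avoid any misunderstanding from Miya, the screenshot Chada shows here is the result of searching for James Blunt.

[Screenshot: face search]

If, like Chada, you have installed the Piclens effects extension in Firefox, you can also turn it on and enjoy the high-definition gallery effect below.

[Screenshot: facesaerch piclens]

Click here to visit Facesaerch >>

Note that the screenshots above were taken in Firefox; the effect is much worse in IE.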

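Google Music Search Interface Revealed; Blogger Tests the Search Results (with Screenshots)

[Sohu IT] August 5: 与G共舞, a blogger who focuses on covering Google products, has published the Google music search interface on his blog and run a simple test. Google's music search, built in partnership with 巨鲸音乐网, is reportedly going online soon. The full text of his test follows:

At noon today I went to the domain for Google music search (www.g.cn/music) and was very surprised to find that the page was already accessible. A few minutes later, however, it reverted to the earlier 404 page. Here is a screenshot of the interface:

When 与G共舞 entered the page, the home page of Google music search was the "Top 100 New Songs" list. Google's usual navigation sits at the top left, and an "Open player" link at the top right. The logo is the ordinary one; no channel logo has been made. Google music search offers listening, downloads, lyrics, and ringback tones. The listening link points to www.google.cn/music/top100/, downloads and lyrics both point to g.top100.cn, and ringback tones take the user to China Mobile's 12530 page.

In Google music search you can search by singer, song title, and album title. Below the search box at the top, three drop-down menus provide navigation. Song charts: Top 200 Songs, Top 100 New Songs, Rock Songs, Film and TV Hits, Folk Songs. Singer charts: Top 100 Singers, Bands and Groups, Male Singers, Female Singers, Mainland Singers, Hong Kong and Taiwan Singers. Album charts: Top 100 Albums, Top 100 New Releases, Rock Albums, Film and TV Soundtracks, Chinese Folk Music Albums, Top 100 New Songs.

How good is the search? When I searched for "周杰伦" (Jay Chou), I got only 《屋顶》, a duet by Jay Chou and 温岚, 《刀马旦》, sung by Jay Chou and 李玟, and the singer information; everything else had little to do with Jay Chou. This indicates that Google has not yet finished negotiating with the company that holds the rights to Jay Chou's songs. In the help documentation for Google music search (which can no longer be opened either), Google explains that if you cannot find a piece of music, it means Google's partner 巨鲸音乐网 has not yet reached a deal with that record company, so "please contact 巨鲸". -_-||| Why should users contact 巨鲸?!

I guess Google must be quite frustrated: one accidental public test, and someone happened to see it, and that someone was me, haha. Then again, they may feel differently, since someone is helping with the hype. The only one who suffers is me: netizens will call this a fawning blog "just like kissbaidu"...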

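Qxinnet Search Redesigned, Video Search Added

Qxinnet Search had been idle for a long time. Today it finally got a redesign, and we have also added video search.

With the new video search you do not need to leave the page: search for a video and watch it right there.

http://search.qxinnet.com

http://search.qxinnet.com/video.html

Please come and take a look!

Screenshots below, for a first look.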

Sunday, December 21, 2008

MYML VS FBML

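Sorry, this is not a proper MYML-versus-FBML article.

At the time of writing, my MYML still cannot really run, while FBML can.

I also have more friends on Facebook, Facebook's market is actually bigger, and it is already internationalized; you can even see it in Malay.

So what I want to say is that MYML is not yet as good as FBML.

It is hard to go international this way!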

UCHOME Feed Message Fix Report

1. Open ./uc_client/control/feed.php in the UCHome directory and find this block:

if($feedlist) {
    foreach($feedlist as $key => $feed) {
        $feed['body_data'] = $_ENV['misc']->string2array($feed['body_data']);
        $feed['title_data'] = $_ENV['misc']->string2array($feed['title_data']);
        $feedlist[$key] = $feed;
    }
}
if(!empty($feedlist)) {
    $maxfeed = array_pop($feedlist);
    $maxfeedid = $maxfeed['feedid'];
    $feedlist = array_merge($feedlist, array($maxfeed));
    if($delete) {
        $this->_delete(0, $maxfeedid);
    }
}

Replace this block with:

if($feedlist) {
    // Take the feedid of the first entry in the list
    $maxfeedid = $feedlist[0]['feedid'];
    foreach($feedlist as $key => $feed) {
        // Convert the stored body/title data strings back into arrays
        $feed['body_data'] = $_ENV['misc']->string2array($feed['body_data']);
        $feed['title_data'] = $_ENV['misc']->string2array($feed['title_data']);
        $feedlist[$key] = $feed;
    }
}
if(!empty($feedlist)) {
    // Delete when $delete is unset or truthy
    if(!isset($delete) || $delete) {
        $this->_delete(0, $maxfeedid);
    }
}

Then save the file.

2. Open ./control/feed.php in the UCenter directory and find this block:

if($feedlist) {
    foreach($feedlist as $key => $feed) {
        $feed['body_data'] = $_ENV['misc']->string2array($feed['body_data']);
        $feed['title_data'] = $_ENV['misc']->string2array($feed['title_data']);
        $feedlist[$key] = $feed;
    }
}
if(!empty($feedlist)) {
    $maxfeed = array_pop($feedlist);
    $maxfeedid = $maxfeed['feedid'];
    $feedlist = array_merge($feedlist, array($maxfeed));
    if(!isset($delete) || $delete) {
        $this->_delete(0, $maxfeedid);
    }
}

Replace it with:

if($feedlist) {
    // Take the feedid of the first entry in the list
    $maxfeedid = $feedlist[0]['feedid'];
    foreach($feedlist as $key => $feed) {
        // Convert the stored body/title data strings back into arrays
        $feed['body_data'] = $_ENV['misc']->string2array($feed['body_data']);
        $feed['title_data'] = $_ENV['misc']->string2array($feed['title_data']);
        $feedlist[$key] = $feed;
    }
}
if(!empty($feedlist)) {
    // Delete when $delete is unset or truthy
    if(!isset($delete) || $delete) {
        $this->_delete(0, $maxfeedid);
    }
}

Then save the file.
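
To see why the choice of element matters, here is a small stand-alone PHP sketch. It assumes the feed list arrives newest-first (highest feedid at index 0), which is my assumption and not something the patch itself states. The sample data and the two helper functions are hypothetical; they only mimic how each version picks $maxfeedid.

```php
<?php
// Hypothetical sample data: three feeds, ASSUMED to be ordered newest-first.
$feedlist = array(
    array('feedid' => 30),
    array('feedid' => 20),
    array('feedid' => 10),
);

// Mimics the original code: take the feedid of the LAST element.
function max_feedid_original(array $feedlist) {
    $maxfeed = array_pop($feedlist); // operates on a local copy of the array
    return $maxfeed['feedid'];
}

// Mimics the patched code: take the feedid of the FIRST element.
function max_feedid_patched(array $feedlist) {
    return $feedlist[0]['feedid'];
}

echo max_feedid_original($feedlist), "\n"; // prints 10
echo max_feedid_patched($feedlist), "\n";  // prints 30
```

Under that ordering assumption, and if _delete(0, $maxfeedid) removes everything up to the given id as its arguments suggest, the original version would pass 10 and leave feeds 20 and 30 behind, while the patched version passes 30. If your list is ordered oldest-first the two versions pick the opposite ends, so it is worth checking the ordering in your own installation.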

Saturday, December 20, 2008

BidVertiser

To put BidVertiser advertisements on your website, go to http://www.bidvertiser.com and apply as a publisher; then you can start earning yourself.

I have registered and placed an advertisement in the left panel of this blog, as you can see.

I am pleased to say that this advertising network is very good. In my experience it is better than Google AdSense because it earns more money.

Making the Most of UCenter 1.5: Discovering the Value of a User Center

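UCenter is a user-center management program that appeared after Discuz! 6.1. Split out of Discuz! as a separate user center, it strengthens the integration capabilities of BBS applications. As the core of user data, UCenter consolidates the most commercially valuable user data in one place, so that it can deliver more value through the various site-building products that use it.
The name UCenter means "user center": the U stands for User and also for You, so the name can be read as "user center" or "your (the end user's) center". UCenter is a bridge that passes information directly between community site-building products. Through UCenter, a webmaster can seamlessly integrate a family of site-building products, giving users single sign-on and letting other community data be exchanged.
UCenter can integrate deeply with any program (provided it follows the API communication protocol) and completely replaces the traditional "passport" system. One UCenter can connect multiple sites or applications across regions and servers. For example, when a forum grows large enough to need sub-sites, with UCenter you can host the sub-sites anywhere and simply point them at the UCenter address. Existing forum users can then roam across all the sub-sites, while member management for all of them stays unified. Members of different sub-sites can add each other as friends and send each other messages. You can also integrate third-party programs that support UCenter, such as BBS, SNS, and CMS applications, so that all member data is shared.
Unlike a traditional passport, UCenter can be deployed on its own or bundled with your most important application. UCenter handles unified login, registration, and user management while preserving each application's existing habits, so you may not even notice it is there. UCenter takes over the site's short messages, friends, and avatars, and makes them available to all programs. UCenter can also set up interaction channels between applications: for example, when someone reads a forum thread, the UCenter interface can show related resources from the site, such as related videos, products, or blog posts, as supplementary content. Applications are no longer isolated but tightly linked. UCenter can also build closer personal connections: you can sync your friend relationships across multiple applications.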
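In UCenter, every application sets user avatars through the same Flash-based uploader provided by UCenter, and the avatars are stored under UCenter's data/avatar directory according to a fixed algorithm. UCenter 1.5 takes full account of webmasters' feedback and suggestions and substantially improves the parts people found inconvenient, aiming to give webmasters UCenter's strong extensibility while making it simpler and easier to pick up. UCenter 1.5 also improves compatibility so that the whole suite runs in more, and more complex, environments. It adds a pure MySQL mode of data communication, which makes UCenter easier to install on hosting with strict limits on remote access.
Uploading an avatar is simpler and clearer in the new version: UCenter 1.5 redeveloped and optimized the Flash part of the avatar uploader. It also supports uploading and cropping GIF animations, so users' avatars can be more distinctive.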
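
As an illustration of the kind of avatar path scheme mentioned above, here is a minimal PHP sketch. It follows my recollection of the layout UCenter uses (the user id zero-padded to nine digits and split into nested directories). Treat the exact format, the function name avatar_path, and the size names as assumptions, and check avatar.php in your own UCenter copy before relying on it.

```php
<?php
// Hypothetical helper: maps a user id to a relative avatar path
// (relative to data/avatar). The layout is an assumption, not confirmed by this post.
function avatar_path($uid, $size = 'middle') {
    // Fall back to 'middle' for unknown sizes (the size names are assumed).
    $size = in_array($size, array('big', 'middle', 'small')) ? $size : 'middle';
    $uid  = sprintf('%09d', abs(intval($uid)));   // e.g. 12345 -> "000012345"
    $dir1 = substr($uid, 0, 3);                   // first three digits
    $dir2 = substr($uid, 3, 2);                   // next two digits
    $dir3 = substr($uid, 5, 2);                   // next two digits
    // The last two digits become the file name prefix.
    return $dir1 . '/' . $dir2 . '/' . $dir3 . '/' . substr($uid, -2) . "_avatar_$size.jpg";
}

echo avatar_path(12345), "\n"; // prints 000/01/23/45_avatar_middle.jpg
```

Spreading files over nested directories like this keeps any single directory from accumulating too many avatar files, which is one plausible reason for such a scheme.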
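UCenter 1.5 is a free, open-source, and relatively independent user center. Its release paved the way for more diverse member services on Discuz! forums and lets Discuz! refocus on its core value as a BBS. UCenter has a well-designed interface, so with simple modifications it can hook up third-party web applications from any other platform and add capabilities to your community forum at any time. As internet applications diversify, more Discuz! forums are integrating more site-building products through UCenter and offering members more online services. UCenter 1.5 also integrates better with the SNS site-building program UCenter Home 1.5, which helps raise the value of a site's members and strengthens the site's core competitiveness.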