Friday, April 23, 2004

Mastering Search Engine Advertising

By Chris Sherman, Associate Editor
April 22, 2004

Buying your way to the top of search results may seem easy, but managing an effective search engine advertising campaign requires a thoughtful approach with more than a little elbow grease.

Search engine advertising has exploded in popularity over the past few years, offering marketers top positioning and providing searchers with appealing alternatives to annoying banners and popups. To newcomers, the process of bidding on keywords, then sitting back and watching site visitors and online sales start rolling in, seems easy and painless.

That is, until the hapless advertiser gets into a bidding war with a non-rational competitor. Or when the boss asks for specific ROI numbers to justify the expense of a search advertising campaign. These are just two of many issues that can bedevil a marketer setting out on a search advertising effort.

Search Engine Advertising, a new book by SearchDay guest writer Catherine Seda, offers a wealth of information about the entire process of creating and running an effective search advertising campaign. It's an excellent book, written by a pro who not only understands the mechanics of search engine advertising, but also has the ability to describe sometimes difficult concepts with ease and skill.

Importantly, the book starts off with a critical but often overlooked activity: planning a successful strategy. The apparent simplicity of search engine advertising can be a trap for those who don't lay a solid foundation for their efforts.

Long before you place your first bid, it's crucial to go through the often challenging process of keyword research. It's also important to spend time writing your ads, creating compelling copy that entices people to click through to your web site. Another key activity is creating effective landing pages, your one chance to convince your visitor to spend more time on your website and ultimately "convert" -- buy a product, register for an account, subscribe to a newsletter. Seda covers these topics in depth, offering numerous tips and strategies from her own extensive experience.

The book also has excellent coverage of paid placement and paid inclusion programs, their strengths and weaknesses, and when and how to most effectively use them. It also covers specialized search engines -- shopping search engines, and targeted search engines that focus on a particular niche, or cater to residents of different regions of the world.

Once you've created and implemented a search engine advertising campaign, you must monitor it constantly, measuring success (or failure) and continually tweaking your approach. Part four of the book provides detailed information about performance measurement, including a look at some popular bid management, analytics, and ROI tracking tools and solutions.

The book wraps up with a section on protecting your profits. Although it's not widely discussed, click fraud can and does occur with paid placement programs. Other problems include improper use of your trademarks by competitors, or affiliates using tactics that may do you more harm than good. Seda shows you how to identify these types of problems, and also offers useful strategies and tips for dealing with them before they become major issues.

Full disclosure: I was a reviewer for this book prior to publication. I'm going to repeat what I wrote then:

The wealth of accurate, savvy information contained in this book makes it a must-read for anyone promoting products or services online. Using even a few of the tips and techniques offered in Search Engine Advertising will boost your results significantly, paying for the book many times over.

fr.: http://searchenginewatch.com/searchday/article.php/3343371

Thursday, April 22, 2004

Females More Likely to Fly Search Coop, Finds Part Two of iProspect Survey

By Kate Kaye
Contributing Writer
Tuesday, April 20, 2004

Who knew search marketers could have abandonment issues? From the looks of the second installment of iProspect's Search Engine User Attitudes Survey, this time focusing on search user perseverance, search abandonment can affect marketers. Among the surprises: unemployed users and female users are less likely to hang in there than their Web counterparts.

While the search marketing company's study shows that nearly 99 percent of Internet users perform searches, many are unwilling to scroll through countless pages before modifying their search terms or switching to another search engine. According to the report, 22.6 percent of searchers bow out after viewing just the first few results, more than 18 percent do so after perusing the first page, over 25 percent leave after the first two pages, and 14.7 percent stick around through the first three pages before ducking out.

Study participants were asked: "If you do not find what you are looking for, at what point do you move on either to another search engine or to another search on the same engine?" The study measures search stick-to-itiveness based on user age, gender, profession, and employment status. "These demographics are where we saw the trends," says iProspect CEO Fredrick Marckini, who suggests that according to the study numbers, 80 percent of users won't see a listing if it does not fall within the top three pages.

Defying their reputation for nesting, 44 percent of women respondents were apt to fly the search coop after viewing the first page of results, as compared to 37 percent of male study participants. Women "tend to go directly to brands that they know and trust for advice to save time," asserts Lauren Wiener, managing director at Meredith Interactive, publisher of American Baby and Ladies' Home Journal's LHJ.com. Because they are extremely busy with demands at work and home, she continues: "They do use search engines to find specific information, but will quickly abandon them if the first page of results are too generic." Meredith buys targeted search engine keywords like "chicken recipes" to promote subscription sales.

The older search engine users become, the more they're prepared to pack it in after one results page. In fact, compared to 32.2 percent of 18- to 29-year-olds and 44.3 percent of 45- to 59-year-olds, nearly 50 percent of those ages 60 years and up said "goodbye" after viewing page one of results. "I suspect that this probably has to do with a feeling of compressed time, or could be evidence of searches being more relevant to them," Marckini conjectures.

What folks do for a living may also play a role in their willingness to ride out a search. More than any other occupation measured, over 52 percent of homemakers leave searches after viewing only the first page of results. Educators come in second at 40 percent, IT and MIS professionals come in third at 38.2 percent, and students come in at just over 27 percent.

"I would speculate that students are looking for more sources," says Marckini, adding: "It may evidence some of their research behavior where obvious results at the top of the list need to be compared to results deeper down."

The study concludes that when it comes to sites that are likely to target unemployed users such as Monster.com and Manpower, appearing within the top ten search results is critical. While 38.4 percent of full-time workers and 41.1 percent of part-time workers say they'll ditch a search after viewing the first results page, more than 44 percent of unemployed search engine users will quit after checking one listings page. "It demonstrates that being on the first page of search results is more important than ever," notes Marckini, who believes that many people searching on job-related keyword terms through search engines are unemployed. According to Overture's Web site, 500,418 Overture searches on keyword "job search" were performed in March.

"Search engines are really the gateway to the Internet; they're the front door," emphasizes Marckini. "Marketers have to recognize that this is the way the Internet works."

fr.: http://www.mediapost.com/dtls_dsp_news.cfm?newsID=247485

Lindows Readies IPO, Testing the Desktop Linux Market

CNET News: Matt Hines  21/04/2004

Linux software vendor Lindows announced Tuesday that it has filed with the Securities and Exchange Commission (SEC) for an initial public offering (IPO) of its common stock.

The company said the offering will be underwritten by San Francisco investment bank WR Hambrecht, but it gave no specific timetable. Lindows markets Linux operating system software and services for desktop and notebook computers.

According to the SEC filing, Lindows hopes to raise $57.5 million through the sale of common stock. One-fifth of the proceeds will go to repaying the $10.4 million invested by Lindows founder Michael Robertson.

The S-1 document Lindows filed with the SEC also discloses the company's finances for the first time. In 2003, the company posted a net loss of $4.08 million on revenue of $2.07 million; in 2002, it lost $6.68 million on revenue of $63,000. The company currently lists total assets of $934,000, including $250,000 in cash, along with $4.7 million in debt.

"It will be interesting to see how the market receives a company with $2 million in revenue and $4 million in losses," said Katherine Egbert, a securities analyst at Jeffries & Co. The company, however, sees a real opportunity: Lindows is looking at the high valuations enjoyed by the likes of Red Hat and Novell, she said.

The S-1 also shows that Lindows' independent auditor has expressed doubt about the company's ability to survive, citing its string of losses and negative cash flow, which may lead the market to question its growth potential. The filing further indicates that, beyond its lawsuit with Microsoft, the company faces several other legal actions. (郭和杰)

fr.: http://taiwan.cnet.com/news/ce/0,2000062982,20089023,00.htm

New Hacker Tool: The Chocolate Bar

CNET News: Munir Kotadia  21/04/2004

A survey of London office workers has found that nearly three-quarters of employees were willing to give up their corporate network passwords in exchange for a chocolate bar.

The survey was carried out by the organizers of Infosecurity Europe 2004, a security trade show that opens in London next week. The organizers questioned 172 commuters at Liverpool Street station, asking whether they would trade their company network password for a chocolate bar.

Surprisingly, 37 percent of those surveyed agreed on the spot. Another 34 percent eventually gave up their passwords after a little coaxing from the interviewers (for example, hints that they probably used a pet's or child's name as a password).

Claire Sellick, event director for Infosecurity Europe 2004, said the results confirm that employers are not doing enough to help staff understand the importance of information security. "At the root of it, this is a problem of missing training and procedures. Employers should make sure employees understand the company's information security policy, and keep that policy up to date," Sellick said.

According to the survey, most respondents resent having to remember so many different passwords and would rather use biometric authentication (such as fingerprint recognition) or smartcards. "Clearly, employees are fed up with having to remember multiple passwords and would welcome alternative identification technologies," Sellick said.

At last month's RSA Security Conference in San Francisco, Microsoft chairman Bill Gates likewise said that traditional passwords are gradually being phased out, because using passwords to protect important data is simply not reliable.

"There is no doubt that, over time, people are going to rely less and less on passwords," Gates said in his keynote. "People use passwords to access different systems. They can't remember them all, so they write them down. That approach just can't stand up to the challenge of keeping data secure." (唐慧文)

fr.: http://taiwan.cnet.com/news/comms/0,2000062978,20089026,00.htm

TCP Flaw Threatens Internet Data Transmission

CNET News: Robert Lemos  21/04/2004

The UK's national emergency response center warned on the 20th that the most widely used protocol for carrying data across the Internet contains a flaw that could let attackers shut down connections between servers and routers.

The advisory, issued by the UK's National Infrastructure Security Co-ordination Centre (NISCC), said the commonly used Transmission Control Protocol (TCP) contains a vulnerability whose impact "varies by vendor and application, but in some deployment scenarios is assessed to be significant." The report noted that network hardware maker Juniper Networks has confirmed its products are susceptible, while Cisco, Hitachi, and NEC are still evaluating theirs.

The weakness enables so-called reset attacks. Many network devices and software programs depend on an uninterrupted stream of data from a single source, or a continuous "session"; if that session is terminated too early, the device can run into all sorts of problems. Security researcher Paul Watson has found a way to disrupt such data streams far more easily than previously thought possible.

The NISCC advisory is based on Watson's findings, which he plans to present at this week's CanSecWest 2004 conference, although NISCC apparently released the report a day early. Watson, who runs the hacker-leaning weblog Terrorist.net, could not be reached for comment before press time.

TCP-related reset attacks have occurred before, and on mailing lists discussing the issue some dismissed the problem as old news rehashed. The conventional wisdom, however, was that to launch such an attack a hacker first had to guess the identifier of the next data packet in a session, with odds of success of roughly one in 4.3 billion.

The NISCC report points out that Watson's research shows any number within a certain window of values will work, which greatly improves the odds of a successful attack.

The impact of a reset network connection varies with the type of application and how resistant the network software is to interference. In some cases, an attack could seriously disrupt the underlying network that routers use to map the most efficient transmission paths between servers. That routing method, the Border Gateway Protocol (BGP), depends on long-lived sessions, and if those connections are disrupted it could cause "medium-term outages," NISCC warned.

The flaw could also affect name servers, which supply the numeric Internet addresses that correspond to domain names such as cnet.com. Such an attack could likewise force the secure channel between a browser and a commerce site to be re-established, disrupting e-commerce transactions. (唐慧文)

fr.: http://taiwan.cnet.com/news/comms/0,2000062978,20089028,00.htm

Phishing Scams Surge

CNET News: Munir Kotadia  21/04/2004

Phishing scams, which use email to trick users into revealing personal information, have jumped from 279 to 215,643 incidents over the past six months, according to security vendor MessageLabs.

Phishing schemes send forged email to users to lure them into giving up all kinds of private information, including passwords and user names; victims are typically enticed to click a link that leads to a doctored, counterfeit website and enter their details there.

MessageLabs, which monitors corporate email traffic, said Monday that it intercepted 279 phishing emails in September 2003. By January 2004 the number had soared to 337,050, before slipping back to 215,643 in March. The company said it has no way of knowing how many people actually fell for the scams.

The Anti-Phishing Working Group (APWG), formed last November, recently warned its financial-institution members about a new type of phishing attack that overlays the user's browser address bar with a small Java applet. With it, attackers can send victims to any website while the address bar appears to show the legitimate site's address.

According to the APWG's website, the new attack mainly targeted Citibank customers in late March. "The attack automatically detects the consumer's browser type and installs JavaScript that replaces the look of the original address bar; users can even type addresses directly into the fake bar, which makes the deception far more likely to succeed," the group's web page states. (陳奭璁)

fr.: http://taiwan.cnet.com/news/software/0,2000064574,20089024,00.htm

Major Vendors Split Over Push for Enterprise Grid Computing

CNET News: Ed Frauenheim  21/04/2004

Heavyweights of the computer industry formed an alliance on Tuesday to promote so-called grid computing inside large enterprises, but the market still has plenty of doubts.

The new group, the Enterprise Grid Alliance (EGA), counts Oracle, HP, Intel, NEC, Sun Microsystems, and Fujitsu Siemens among its founding members.

The alliance aims to speed industry adoption of grid computing, that is, linking computers, storage devices, and networks into a pooled computing resource that enterprises can reallocate and adjust as business needs change. The group said it is working with existing industry associations and standards bodies, and may also develop some specifications of its own.

"With its pragmatic, enterprise-focused approach, the EGA is uniquely positioned to deliver near-term, tangible benefits to businesses," EGA president Donald Deutsch said in a statement. Deutsch is also Oracle's vice president of standards strategy and architecture.

The group still faces challenges, however, starting with the absence of major players such as IBM, Microsoft, and SAP. All of these vendors play important roles in business computing, yet IBM and Microsoft back other organizations.

"The absence of companies such as IBM, Platform Computing, Avaki, and Verari raises plenty of questions," an Illuminata analyst said in an email reply to reporters' questions.

At Tuesday's press conference, Deutsch acknowledged the issue.

"Yes, we have talked with IBM. We have also talked with Microsoft and SAP," he said. "We would welcome their participation."

IBM has long been one of the leading promoters of grid computing technology. An IBM spokesperson said Tuesday: "We are evaluating the group's goals and mission. We will make a decision once we have more information."

Another question facing the alliance is whether it duplicates the role of existing groups such as the Global Grid Forum (GGF), or simply adds to the confusion. So far, the prospects for grids and related concepts, such as turning computing into a utility-style service, remain far from clear.

HP, for example, has had to work hard to explain what "connecting computing systems to business processes" means when describing its Adaptive Enterprise vision.

The Global Grid Forum hinted in a statement that the EGA may not be necessary: "As a global forum, the GGF is representative, and has established a venue and processes for solving hard problems, including standards development and deployment experience. The Enterprise Grid Alliance has set up a separate organization, which adds questions for enterprise grid deployment."

That said, the GGF expressed willingness to work with the newcomer to understand its specific goals and plans.

The research arms of IBM and Microsoft are both major sponsors of the GGF.

Asked how the EGA differs from the Global Grid Forum, Deutsch stressed that the alliance has its own focus. "The GGF's scope is broader than the EGA's," he said, adding that the EGA is not concerned with scientific computing or academic research, and instead concentrates on business applications such as ERP and CRM.

But Illuminata's Eunice doubts the new alliance will have much to do.

"In practice, the work the EGA wants to do can't be carved up that cleanly. Basic grid protocols, standards, and reference implementations have developed quite well under the GGF and the Globus Alliance," he said. "If the EGA actually produces something concrete, fine, but I'm afraid the EGA is really just a marketing organization." Eunice also noted that most of the EGA's founding members are not leaders in grid computing.

There is also a similar group, the DCML Organization, that wants to link resources across different computers. Founded last year, it aims to establish the Data Center Markup Language (DCML) standard so that equipment from different vendors can share key operational information. DCML has likewise struggled because major vendors are missing: although its founding members include heavyweights such as EDS and CA, IBM, Sun, and Microsoft are all absent. (郭和杰)

fr.: http://taiwan.cnet.com/news/hardware/0,2000064553,20089029,00.htm

Wednesday, April 21, 2004

URLS! URLS! URLS!

by Bill Humphries

Looking around the web, you’ve run across plenty of URLs that look like:

/content.cgi?date=2000-02-21
/article.cgi?id=46&page=1

Server side scripts generate the content of those pages. The content of a particular page is uniquely determined by the URL, just as if you requested a page with the URL /content/2000-02-01.html or /article/46.1.html. These pages are different than server-generated pages created in response to a form like a shopping cart, or enrollment. However, search engines will not index these content pages, because search engines ignore pages generated by CGI scripts as potential blind alleys.

A search engine would follow a URL like

/content/2000/02/21,

so some way of mapping a URL like /content/2000/02/21 to the script /content.cgi?date=2000-02-21 would be useful. Not only will search engines follow such a link, but the URL itself is easy to remember. A frequent visitor to the site would know how to reach the page for any day the site published content. When I changed the interface for viewing entries by topic in my WebLog from /meta.php3?meta=XML to /meta/XML, search engines such as Google started indexing, and I’m getting more visits referred by search engines.

The trick is to tell the outside world that your interface is one thing: /content/YYYY/MM/DD, but when you fetch the page, you’re accessing /content.cgi?date=YYYY-MM-DD. Web servers such as Apache and content management systems such as Userland’s Manila and the open source Zope support this abstraction.

The abstraction is also useful because a site’s infrastructure is rarely stable over time. When engineering replaces the Perl CGI scripts with Java Server Pages, and the URLs become /content.jsp?date=YYYY-MM-DD, your users’ bookmarked URLs break. When you use an abstraction, your users bookmark /content/YYYY/MM/DD, and when you change your back end, you update /content/YYYY/MM/DD to point at /content.jsp?date=YYYY-MM-DD without breaking bookmarks.
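As a minimal sketch (assuming Apache's mod_rewrite, which is covered below), the published URL stays put while only the target of the rule changes:

RewriteEngine On
RewriteRule ^content/([0-9]+)/([0-9]+)/([0-9]+) /content.jsp?date=$1-$2-$3

Bookmarks to /content/YYYY/MM/DD keep working; when engineering swaps the back end again, you edit the right-hand side of the rule and nothing else.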

If you’re not publishing content dynamically, and have URIs like:

/content-YYYY-MM-DD.html,

you don’t have the problem with indexing that the dynamic content has. However, you still may want to adopt this type of URI for consistency with other sites. Remember people coming to your site want to use an interface they are familiar with, and URIs are part of your interface.

Rewriting the URL in Apache
The Apache Web server is ubiquitous on both Unix and NT, and it has an optional module, mod_rewrite, that will rewrite URLs for you. It isn't enabled as part of the standard install. Pair Networks, Dreamhost, and Hurricane Electric have it enabled on their servers. If you are running your own server, check with your systems administrator to see if it's installed, or have her install it for you.

The mod_rewrite module works by examining each requested URL. If the requested URL matches one of the URL rewriting rules, that rule is triggered, and the request is handled by the rewritten URL.

If you’re not familiar with Apache, you’ll want to read up on how its configuration files work. The best place to run mod_rewrite from is your server’s httpd.conf file, but you can call it from the per directory .htaccess file as well. If you don’t have control of your server’s configuration files, you’ll need to use .htaccess, but understand there’s a performance hit because Apache has to read .htaccess every time a URL is requested.

The Goal
The goal is to create a mod_rewrite ruleset that will turn a URL such as the one shown below:

/content/YYYY/MM/DD

into a parameterized version such as the one shown next, or into something similar, as long as it's the right URI for your script.

/content.cgi?date=YYYY-MM-DD

The Plan
We start with the URI /content/YYYY/MM/DD and want to get to /content.cgi?date=YYYY-MM-DD (the examples that follow use /archives and archives.cgi, but the pattern is identical). So we need to do a few things:

Recognize the URI
Extract /YYYY/MM/DD and turn it into YYYY-MM-DD
Write the final form of the URI /archives.cgi?date=YYYY-MM-DD
Regular Expressions and RewriteRule
This transform will require two of the directives from mod_rewrite: RewriteEngine and RewriteRule. RewriteEngine’s a directive which flips the rewrite switch on and off. It’s there to save administrators typing when they want or need to disable rewriting URLs. RewriteRule uses a regular-expression parser that compares the URL or URI to a rule and fires if it matches.

If we're setting the rule from the directory where it fires, using the .htaccess file, then we need the following:

RewriteEngine On
RewriteRule ^archives/([0-9]+)/([0-9]+)/([0-9]+) archives.cgi?date=$1-$2-$3

What that rule did was first match on the string ‘archives’ followed by any three groups of one or more digits (the [0-9]+) separated by ‘/’s, and rewrote it as archives.cgi?date=YYYY-MM-DD. The parser keeps a back reference for each match string in parentheses, and we can substitute those back in using $1, $2, $3, etc.

If your page has relative links, the links will resolve as relative to /archives/YYYY/MM/DD, not /archives. That means your relative links will break. You should use the base element in the head of the page to reanchor the page.
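For example (a minimal sketch, assuming your real pages and images live at the site root), a single base element in the generated page's head re-anchors relative links there:

<base href="http://www.yoursite.com/">

With that in place, a relative link such as images/photo.gif (an arbitrary example) resolves to /images/photo.gif instead of /archives/YYYY/MM/images/photo.gif.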

RewriteRule for Static Content
If you have a series of static HTML files at your document root:

/news-1999-12-31.html
/news-2000-01-01.html
/news-2000-01-02.html

...and want your readers to access them with URLs like /archives/1999/12/31, then you would need a rewrite rule at the document root, such as:

RewriteRule ^archives/([0-9]+)/([0-9]+)/([0-9]+)$ /news-$1-$2-$3.html
RewriteRule ^archives$ /index.html

If the news-YYYY-MM-DD.html files are in a folder called /archives, the rewrite rule should be:

RewriteRule ^/archives/([0-9]+)/([0-9]+)/([0-9]+)$ /archives/news-$1-$2-$3.html

If you want to use an .htaccess file at the archive folder level, then the rule becomes:

RewriteRule ^([0-9]+)/([0-9]+)/([0-9]+)$ news-$1-$2-$3.html

Also, you may delete the second rewrite rule since you can use a DirectoryIndex rule instead.

DirectoryIndex index.html

Corner Cases
What if someone enters http://www.yoursite.com/archives instead of http://www.yoursite.com/archives/YYYY/MM/DD? The rule is that mod_rewrite steps through each rewrite rule in turn until one matches or no rules are left. We can add another rule to handle that case.

RewriteEngine On
RewriteRule ^archives/([0-9]+)/([0-9]+)/([0-9]+) archives.cgi?date=$1-$2-$3
RewriteRule ^archives$ index.html

In this case, redirect to an index page. But you could redirect to a page that generates a search interface.

What If My Server’s not Apache?
Unfortunately IIS does not come with a rewrite mechanism. You can write an ISAPI filter to do this for you.

If you are running the Manila content management system that comes with Userland’s Frontier, the options allow you to map a particular story in the system to a simple URL.

The Zope publishing system also supports mapping of paths into arguments for server scripts.

References
Good URLs are part of interface design. Jakob Nielsen discusses this in his Alertbox column: http://www.useit.com/alertbox/990321.html.

This article was inspired in part by Tim Berners-Lee’s observation that good URLs don’t change: w3.org/Provider/Style/URI

Ralf Engelschall has many examples of mod_rewrite in ‘cookbook’ form at his site: http://www.engelschall.com/pw/apache/rewriteguide/.

I use these techniques to create a standard interface to my weblog.

Bill Humphries has been developing for the web since 1995. He runs the More Like This weblog covering XML, web publishing and whatever other esoteric items he likes. By day, he works for a company in Silicon Valley where he helps push the web onto wireless devices.

fr.: http://alistapart.com/articles/urls

How to Succeed With URLs

by Till Quack
If you’re building or maintaining a dynamic website, you may have considered the problem of how to get rid of unfriendly URLs. You might also have read Bill Humphries’s ALA article on the topic, which presents one (very good) solution to this problem.

The main difference between Bill Humphries’s article and the solution I will present here is that I decided to do the actual URL transformations with a PHP script, whereas his solution uses regular expressions in an .htaccess file.

If you prefer working with PHP instead of using regular expressions, and if you want to integrate your solution with your dynamic PHP sites, this might be the right method for you.

Why worry about URLs?
Good URLs should have a form like /products/cars/bmw/z8/ or /articles/january.htm and not something like index.php?id=12. But the latter is the kind of URL most publishing systems generate. Are we stuck with bad URLs? No.

The idea is to create “virtual” URLs that look nice and can be indexed by bots (if you set your links this way also) – in fact, the URLs for your dynamic content can have any form you like, but at the same time static content (that might also be on your server) can be reached by its regular URL.

When I built my new site, I was looking for a way to keep my URLs friendly by following these steps:

1. A user enters a URL like www.mycars.com/cars/bmw/z8/
2. The code checks to see if the entered URL maps to an existing static HTML file
3. If yes, the file is loaded; if no, step 4 is executed
4. The URL string is used to check if there is dynamic content corresponding to the entered URL (e.g. in a database).
5. If yes, the article will be displayed
6. If no, an Error 404 or a custom error message will be displayed.
A Collection of tools
This article will provide you with all the information necessary to implement this solution, but it’s more a collection of tools than a complete step-by-step guide to a finished solution. Before you start, make sure you have the following:

mod_rewrite and .htaccess files
PHP (and a basic understanding of PHP programming)
a database like mySQL (optional)
The index takes it all
After browsing the web and checking some forums, I found the following solution to be the most powerful: All requests (with some important exceptions – see below) for the server will be redirected to a single PHP script, which will handle the requested URL and decide which content to load, if any.

This redirection is done using a file named .htaccess that contains the following commands:

RewriteEngine on
RewriteRule !\.(gif|jpg|png|css)$ /your_web_root/index.php

The first line switches the rewrite engine (mod_rewrite) on. The second line redirects all requests to a file index.php EXCEPT for requests for image files or CSS files.

(You will need to enter the path to your web-root directory instead of "your_web_root". Important: This is something like "/home/web/" rather than something like "http://www.mydomain.com.")

You can put the .htaccess file either in your root directory or in a sub-directory, but if you put the file in a sub-directory, only requests for files and directories "below" this particular directory will be affected.

The magic inside index.php
Now that we’ve redirected all requests to index.php, we need to decide how to deal with them.

Have a look at the following PHP code; explanations follow below.

//1. check to see if a "real" file exists..

if(file_exists($DOCUMENT_ROOT.$REQUEST_URI)
and ($SCRIPT_FILENAME!=$DOCUMENT_ROOT.$REQUEST_URI)
and ($REQUEST_URI!="/")){
$url=$REQUEST_URI;
include($DOCUMENT_ROOT.$url);
exit();
}

//2. if not, go ahead and check for dynamic content.
$url=strip_tags($REQUEST_URI);
$url_array=explode("/",$url);
array_shift($url_array); //the first one is empty anyway

if(empty($url_array)){ //we got a request for the index
include("includes/inc_index.php");
exit();
}

//Look if anything in the Database matches the request
//This is an empty prototype. Insert your solution here.
if(check_db($url_array)==true){
do_some_stuff(); output_some_content();
exit();

//3. nothing in DB either:
//Error 404!
}else{
header("HTTP/1.1 404 Not Found");
exit();
}
Step 1, lines 1-9: check to see if a “real” file exists:
First we want to see if an existing file matches the request. (This might be a static html file but also a php or cgi script.) If there is such a file, we just include it.

On line 3, we check to see if a corresponding file is in the directory tree using $DOCUMENT_ROOT and $REQUEST_URI. If a request is something like www.mycars.com/bmw/z8/, then $REQUEST_URI contains /bmw/z8/. $DOCUMENT_ROOT is a variable which contains your document root – the directory where your web files are located.

Line 4 is very important: We check to see if the request was not for the file index.php itself – if it were, and we just went ahead, it would lead to an endless loop!

On line 5, we check for another special case: a REQUEST_URI that contains a "/" only – that would also be a request for the actual index file. If you don’t do this check, it will lead to a PHP Error. (We will deal with this case later on.)

If a request passes all these checks, we load the file using include() and stop the execution of index.php using exit().

Step 2, lines 14-28: check for dynamic content:
First, we transform the $REQUEST_URI to an array which is easier to handle:

We use strip_tags() to remove HTML or JavaScript tags from the Query String (basic hack protection), and then use explode() to split the $REQUEST_URI at the slashes ("/"). Finally, using array_shift(), we remove the first array entry because it’s always empty. (Why? Because $REQUEST_URI always starts with a "/").

All the elements of the request string are now stored in $url_array. If the request was for www.mycars.com/bmw/z8/, then $url_array[0] contains "bmw" and $url_array[1] contains "z8." There is also a third entry $url_array[2] which is empty – if the user did not forget the trailing slash.

How you deal with this third entry depends on what you want to do; just do whatever fits your needs.
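For instance (a minimal sketch, one of several reasonable choices), you could simply drop a trailing empty element so that /bmw/z8/ and /bmw/z8 end up producing the same array:

if(end($url_array)===""){
array_pop($url_array); //drop the empty element left by a trailing slash
}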

What if that $url_array is empty? You may have realized that this corresponds to the case of the $REQUEST_URI containing only a slash ("/"), which I mentioned above.

This is the case when the request is for the index file (www.mycars.com or www.mycars.com/). My solution is to just include the content for the mainpage, but you could also load an entry from a database.

Any other request is now ready to use. At this point your creativity comes into play – now you can use the URL elements to load your dynamic content. You could, for example, check your database for content that matches the query string; this is sketched in pseudo code on lines 25-28.

Suppose you have a string like /articles/january.htm. In this case, $url_array[0] contains “articles” and $url_array[1] contains “january.htm.” If you store your articles in a table "articles" that includes a column "month," your code could lead to a query like this:

str_replace (".htm","", $url_array[1]);
//removes .htm from the url
$query="SELECT * FROM $url_array[0] WHERE
month='$url_array[1]'";You could also transform the $url_array and call a script, much as Bill Humphries suggests in his article. (You need to call the script via the include() function.)

Step 3, lines 30-32: nothing found.
The last step deals with the case that we neither found a matching static file in step one, nor did we find dynamic content matching the request – that means that we have to output an Error 404. In PHP this is done using the header() function. (You can see the syntax to output the 404 above.)

Beware of hackers
One part of this procedure creates a few vulnerabilities. In step one, when you check for an existing file, you actually access the file system of your server.

Usually, requests from the web should have very limited rights, but this depends on how carefully your server is set up. If someone entered ../../../ or something like /.a_dangerous_script, this could allow them to access directories below your web-root or execute scripts on your server. It’s usually not that easy, but be sure to check some of those possible vulnerabilities.
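One hedge against this (a sketch only, meant to replace the plain file_exists() test in step one) is to resolve the requested path with realpath() and serve the file only if it still lies inside your document root:

$real=realpath($DOCUMENT_ROOT.$REQUEST_URI);
//realpath() resolves any ../ tricks; only include files inside the document root
if($real!==false
and strpos($real,realpath($DOCUMENT_ROOT))===0
and $real!=$SCRIPT_FILENAME){
include($real);
exit();
}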

It’s a good idea to strip HTML, JavaScript (and maybe SQL) tags from the querystring; HTML and Javascript tags can easily be removed using strip_tags(). Another wise thing to do is limit the length of the query string, which you could do with this code:

if(strlen($REQUEST_URI)>100){
header("HTTP/1.1 404 Not Found"); exit;
}

If somebody enters a query string of more than 100 symbols, a 404 is returned and the script execution is stopped. You can just add these (and other security related functions) at the beginning of the script.

How to deal with password protected directories and cgi-bin
After I had implemented the whole thing, I realized that there was another problem. I have some password protected directories, e.g. for my access statistics. When you want to include a file in one of these directories, it won't work, because the PHP module runs as a different user which cannot access this directory.

To solve this problem, you need to add some lines to your .htaccess file, one for each protected directory (in this example the directory /stats/):

RewriteEngine on
RewriteRule ^stats/.*$ - [L]
RewriteRule !\.(gif|jpg|png|css)$ /your_web_root/index.php

The new rule on the second line excludes all access for /stats/ from our redirection rule. The "-" means that nothing is done with the request, and the [L] stops execution of the .htaccess if the rule at this particular line was applied. The original rule on the third line is applied to all other requests.

I recommend the same solution for your cgi-bin directory or other directories where scripts that take GET queries reside.
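For example (assuming the scripts live under /cgi-bin/ at the web root), the exclusion looks just like the one for /stats/:

RewriteRule ^cgi-bin/.*$ - [L]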

fr.: http://alistapart.com/articles/succeed

Are You Playing Hide and Seek with Search Engines?

Search engines do have the ability to spider secure-server-hosted pages, but often these pages require either that a visitor fill out a form or log in with a password and user name before being allowed past a certain point. If any page requires filling out of forms or passwords to reach, search engine robots will simply leave. They can't log in because they can't fill out forms, leave email addresses or enter passwords.

A Webmaster for a 4,500-page ecommerce web site contacted me. He wondered why search engines were ignoring such a large site. I asked for the URL of the site and visited the home page. I noted that upon loading, the http://anybusiness.com site immediately forwarded visitors to a secure httpS://anybusiness.com page. This has two immediate faults that may be a problem -- the forwarding method and the different server. If the instant forward is done by JavaScript, then it's bad news.

First, search engines often either penalize or downgrade sites that use immediate URL forwarding, especially from a home page. URL forwarding suggests doorway pages (a search engine no-no) or affiliate URLs forwarding to an affiliate program site, or the worst of all scenarios, cloaking software on your server. You may not be doing any of these things, but the robots don't know, don't care, and don't index your site, plain and simple.

Secondly, secure servers are very often a separate web site, meaning that the secure server is actually a different machine and is an entirely different site from the non-secure server site unless your site is hosted on a dedicated server on its own IP address, with a security certificate at the same domain. This can happen when secure shopping carts are hosted by a third-party host so that a small ecommerce site needn't purchase a security certificate or set up complex shopping carts.

For example, if your shopping cart is hosted by Yahoo stores or other application service providers (ASPs), pages hosted in the shopping cart don't reside on your domain and can't be recognized as pages on YOUR site unless you also host your domain with the same company. Unfortunately, many shopping cart ASPs use dynamic IP addresses (IP address is different each time you visit) and use database-generated dynamic pages.

The process of serving dynamic pages is not the problem. The problem is simply that the URL of those pages contains several characters that either stop or severely curtail search engine spiders. Question marks (?) are the biggest culprit, followed by ampersands (&), equal signs (=), percent symbols (%) and plus signs (+) in the URLs of dynamic pages.

These symbols serve as alarm bells to the spiders and either turn them away entirely or dramatically slow the indexing of your pages. This is stated simply in the Google "Information for Webmasters" page.

1. Reasons your site may not be included.

"Your pages are dynamically generated. We are able to index dynamically generated pages. However, because our web crawler can easily overwhelm and crash sites serving dynamic content, we limit the amount of dynamic pages we index."

Just because your site is dynamically generated, creating long URLs full of question marks, equal signs and ampersands like www.domain.com/category.asp?ct=this+%28that+other%29&l=thing doesn't mean you are in search engine limbo. There are simple solutions available for your Webmaster. Here are a couple of articles explaining an elegant solution called "mod_rewrite."

You can read about that technique if technically inclined:

http://alistapart.com/articles/urls
http://alistapart.com/articles/succeed

This technique is simply creating a set of instructions for your web server to present URLs in a different form that replaces those "bad" question marks and ampersands with slash marks (/) instead. The method will require that your Webmaster is a bit more technically savvy than most home-business CEOs who created their own web site. Some hosts will help here by simply turning on the "mod_rewrite" for shared hosting clients.
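As a rough illustration only (it assumes Apache with mod_rewrite enabled; category.asp is the example script from the URL above, and the articles above cover the real details), a single rewrite rule can expose a dynamic page under a slash-style address:

RewriteEngine On
# lets spiders request /category/this/thing instead of /category.asp?ct=this&l=thing
RewriteRule ^category/([^/]+)/([^/]+)$ /category.asp?ct=$1&l=$2 [L]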

Don't play hide and seek with the search engines! Tell them exactly where to find every page on your site -- and if there's any question that they will find every page on your site, give them a map.

Hard-code those dynamic URLs for most subcategories within the categories of different sections of your web site into your comprehensive site map. As long as those dynamic links (even those that include ?=+%& symbols) are hard-coded into a site map, the spiders will follow them. Clearly those 4,500 pages mentioned earlier would be too much for a site map listing. But the main category pages could be provided for the engines.

I visited the site map page of the Webmaster mentioned above and saw 14 pages listed on the site map. That explains why they have 14 pages, not 4,500, indexed by Google.

How to find out how many pages of your site are indexed? Go to Google Search and type "allinurl:www.domain.com" (without the quotes, replacing "domain" in the above example with your own domain name). This query operator will return a list of every page of your site. Look in the blue bar across the top of the Google results page and you'll see the number of pages indexed at your site!

That should do it. Get indexed and stop playing hide-and-seek!

fr.: WebAdvantage eMarketing Newsletter April 15, 2004

Search Engine Users: Loyal or Blase?

By Chris Sherman, Associate Editor
April 19, 2004

Searchers are loyal to their favorite search engine, and stubbornly stick with it even if they don't initially find what they're looking for, according to a new survey of web users.

Loyal or lazy? 56.7 percent of Internet users use the same search engine or directory when they are looking for information, according to a new iProspect Search Engine User Attitudes Survey.

Just 30% of web users have a few specific search engines they use on a regular basis. A scant 13% follow the advice that we frequently preach here at Search Engine Watch, using a different search engine depending on what they are looking for.

What are the favorites? No surprises here: Google has what iProspect calls a "loyalty rate" of 66 percent. Yahoo! is next, at 55%, followed by MSN at 54% and AOL at 49 percent.

The survey also found that searchers are hardheaded, with more than 90% tweaking their query and using the same search engine after being dissatisfied with the first three pages of results returned by their initial search. iProspect says that this finding suggests that over time users have developed more confidence in their search engine of choice. Another possible interpretation is that searchers have simply become habituated, or are too lazy to try a different engine when their favorite doesn't measure up.

The survey found that nearly 50 percent of respondents have installed at least one search toolbar. The most popular was Yahoo's, with 22% share, followed by Google's at 20% and MSN's at 17 percent.

fr.: http://searchenginewatch.com/searchday/article.php/3342041

HP CEO: Broadcasting's Digital Revolution Is Here

CNET News: Stefanie Olsen  20/04/2004

HP CEO Carly Fiorina told broadcasters at the National Association of Broadcasters (NAB) convention on Monday that they will be left behind if they do not embrace digital technology.

"There is no question that a force is taking shape in this era of broadcasting, and that force is digital technology," Fiorina said in her NAB keynote. Speaking of the transformation in how broadcast programming is created, produced, and distributed, she added: "We have entered a new era in which all signal processing and content is moving from analog, static, and physical to digital, mobile, and virtual."

Although that transformation will play out over the next decade, Fiorina said broadcasters need to start thinking about how to build new business models around commercial-skipping devices such as TiVo and around digital archives of their programming.

"What happens when a teacher in a U.S. history class wants to pull Civil War television programs and film footage out of your digital archive?" she asked. "The answer involves not only thinking through new economic models, but also putting new technologies to use."

Fiorina's argument centered on HP technologies for creating, distributing, and consuming content. HP has singled out digital entertainment and utility computing as growth engines for the next few years.

Under its utility computing plan, HP hopes to rent out computing power to broadcasters, film studios, and other companies, easing their hardware costs. Under its digital entertainment strategy, HP wants to tailor products and services to those businesses and to build new services and devices for the industry, such as color-correction and video-editing products. On the consumer side, HP also plans to roll out digital entertainment hardware and software.

For example, HP recently announced a deal around Nokia's Visual Radio service, a plan to bring radio broadcasts to listeners in real time over mobile phones. HP also said Monday that it is working with studios such as DreamWorks and Warner Bros. on digital animation, editing, distribution, and film restoration.

Fiorina was followed on stage by NAB chief executive Edward Fritts, whose convention drew 90,000 attendees this year. He too spoke of the digital revolution and of the leading role broadcasters will play in the transition, while also playing down the Internet in order to stress broadcasters' importance to local communities.

"Going online and using a search engine to find local news can never replace the immediate, real-time, local service that broadcasters provide," he said. "Broadcasting's role, and its future, should be in good shape."

Fritts said television broadcasters are leading the move to digital channels under a mandate from FCC chairman Michael Powell. But he criticized the FCC for not holding cable operators to the same digital television rules and for not requiring them to carry digital TV multicasts.

"Nearly 1,200 local stations are broadcasting digital signals, but only a third of them are carried on cable," he said. "Our digital TV and high-definition signals are ready to go, but they have nowhere to go." (郭和杰)

fr.: http://taiwan.cnet.com/news/ce/0,2000062982,20089005,00.htm

Tuesday, April 20, 2004

Behind the Scenes at News Aggregator Topix.Net

By Gary Price, Guest Writer
April 13, 2004

Topix.net combines an excellent news search engine with two other hot technologies: local search and personalization.

The Topix database includes full text news stories from over 4,000 sources, including a great deal of content that's difficult to quickly access elsewhere. The real power of this nifty news search engine comes from its easy-to-use pre-built pages that aggregate news and other information into more than 150,000 topic-specific pages.

These specialized pages cover local news and information for every zip code in the United States. There are also news pages dedicated to specific companies, industries, sports teams, actors, and many other subjects.

We interviewed Rich Skrenta, CEO of Topix.Net, via e-mail.

Q. Where did the idea for Topix.Net come from? What made you decide that this service was needed in the current marketplace? What does Topix.Net offer that's not available from other companies?

In 1998 we did a project called NewHoo, which was acquired by Netscape/AOL, and is now called the Open Directory Project (ODP). It used a massive group of volunteers to build the web's largest human-edited directory. The ODP now has 60,000 volunteer editors, and the data powers Google Directory.

Our team left Netscape/AOL in 2002, and rather than using human labor again, we wanted to explore emerging AI (artificial intelligence) techniques for classifying and extracting structured data from the web. The goal for Topix.net is to make a web page about everything -- every person, place, and thing in the world -- constantly machine-summarized from the Internet. Since the web can be a messy place, surfing a well-constructed encyclopedia based on live content from the web would be a win for users.

Rather than starting with a full web crawl, which has 4 billion+ pages, we started with news, which has 4,000 sources, and is very dynamic and high quality content. We don't cover everything in the world yet, but we do have every place in the U.S., every sports team, music artist, movie personality, health condition, public company, business vertical, and many other topics.

Q. Can you share some background about how Topix.Net builds a page? Are pages built automatically or is there some human intervention? Is the technology your own? How long did it take to get it up and running?

We developed separate software modules to crawl, cluster and categorize articles. The heart of our system is a proprietary AI categorizer that uses a massive Knowledge Base (KB) to determine the geographic location and subject categorization for each story.

The final step is the Robo-Editor, which picks the best stories for display. For example, our 2004 Presidential Election page may have seen 1,000 articles for the past hour. The Robo-Editor's job is to pick the 10 best articles to show the user to give them a good overview of the news.

Our system is fully automated, there is no human involvement at any stage. We developed the technology in-house over the past two years. The AI was particularly tricky to get right, since an accuracy rate in excess of 99% was necessary to make the system useful.

Q. Do you have any plans to market your crawling and categorization technology as a source of revenue or providing your services to create Topix.Net pages for companies and other organizations?

We have a commercial feed business for companies that want to enhance their own website offerings with deeply categorized news content. Topix.net offers an extremely rich newsfeed -- in addition to the standard URL, title, and summary, we have the latitude/longitude of the news source, the latitude/longitude for the subjects of the story, the prominence of the news source, the subject categorizations, and more. We can also "geo-spin" any subject category, to produce a locally focused version. These features give us a lot of flexibility to customize feeds for clients.

We're also excited about using our categorization technology to apply to other areas beyond news, such as local web search.

Q. Are you crawling and aggregating web content other than news sources? Do you include press release material?

In addition to newspapers, Topix.net is crawling radio and TV station websites, college papers, and some high school papers and weblogs. We're also crawling government websites with "newsy" public information, such as police department crime alerts, health department reports, OSHA violation announcements, coast guard notices, and news releases from other city, county and state level government entities. We are crawling and including press releases too.

Our focus is on hyperlocal deep coverage of the U.S. We love police blotters and little papers with extremely local coverage. If your local PTA has online meeting minutes, that's the kind of source we want to add.

Q. Does Topix.Net offer any type of RSS/syndication options?

We have an RSS feed for each of our 150,000 categories. This includes an RSS feed for every ZIP code in the U.S. Topix.net is the largest publisher of non-weblog RSS on the net.

Each of our pages also has an "Add to My Yahoo" button, which drops Topix.net headlines onto your My Yahoo desktop. We worked with the My Yahoo team to pre-load 35,000 of our newsfeeds into their new RSS reader module.

In addition to the RSS feeds, we also have free javascript headline syndication. Website owners can easily add a Topix headline box from any of our categories to their site by including a bit of HTML.

Q. What are Topix.Net's current sources of revenue?

Website advertising and commercial newsfeed sales.

Q. What do you have in the pipeline to further enhance Topix? In other words, what will Topix.Net offer in a year that's not available today? What about local pages for areas outside of the U.S.?

Expanding beyond the U.S. to full worldwide coverage is something we'd like to do. We're also looking at adding personalization features to the site, and using our categorization technology to apply to content beyond just news.

Gary Price is the publisher of ResourceShelf, a weblog covering the online information industry.

fr.: http://searchenginewatch.com/searchday/article.php/3339631

Your Library Online

By Janet Rubenking
February 17, 2004

You've probably visited your public library's Web site to see whether it has a particular book in stock, but you may not realize that library Web sites offer free and easy access to an incredible array of online reference materials that would otherwise be too expensive or inaccessible to most of us. You can track down a biography of Ludwik Lazar Zamenhof, see the great paintings of Paul Klee, research businesses, or take practice tests to see whether you're ready for the big exam—all from the comfort of your home PC. While many of these references have a decidedly academic feel to them, they provide more and better information than the free Web resources we're used to, such as Google, InfoSpace, and KartOO. The databases are collections of carefully selected material, thereby reducing the clutter of irrelevant results being included because they share some of the same keywords. And since the databases are supplied by services dedicated to their upkeep, you can expect the information in those databases to be more accurate.

Where Do I Start?


Your typical library Web page has links to its OPAC (online public access catalog), library hours and services, librarians' favorite Web links, a list of pages designed for children and their homework needs, and a link to articles and databases. The articles and databases are your keys to up-to-date and archived newspaper and journal articles, reference material (such as encyclopedias and almanacs), indexes with abstracts and citations, testing and learning materials, career guidance, and much more. Large metropolitan libraries offer a greater variety of resources, but even the smallest libraries offer one or two comprehensive databases. If you are lucky enough to live in a state with a public-library consortium, residents anywhere in the state have access to the same subscription databases. The Michigan eLibrary (MeL) is one such resource. To find a library near you, visit the Library of Congress State Libraries listings (www.loc.gov/global/library/statelib.html). Many state libraries have local directories for their library networks.

Remote Resources

A large percentage of the databases are available through remote access, though there may be restrictions on some. Depending on the library, you can sign on using your library card number, password/PIN code, ZIP code, driver's license number, or state ID at the home page.

There's a dizzying array of databases available, and your library probably classifies them by subjects such as arts and humanities, business, science, and education. There are basically three types of formats, though there is some overlap among the resources: Full-text databases have complete articles from magazines, journals, and newspapers. Indexed databases contain abstracts and citations from books, journals, magazines, and reference books. Reference databases offer dictionary- and encyclopedia-type sketches.

Most of the databases have tools to save searches, make lists, and print or e-mail articles and citations. For students, the automatic citations generated from marked lists of articles save hours of tedious formatting. The larger databases feature powerful search mechanisms, so check out the help files for instructions. All the databases are updated periodically, some as frequently as several times a day. The following databases are just a few of the common resources available through most libraries.

Gale/Thomson's InfoTrac Databases. The InfoTrac Databases are some of the oldest and most common databases around. They offer a variety of resources such as Literature Resource Center, Associations Unlimited, and Business and Company Resource, and most libraries subscribe to one or more. Once you enter one of the databases, you can navigate to others without having to log on again. One of InfoTrac's featured tools is InfoMarks. An InfoMark at the top of the page indicates that the URL for the page is persistent and can be saved to your Favorites list for future reference. InfoMarks can be shared with other Gale database users, and they can be copied and pasted into word processing documents, e-mails, and Web pages. Each time a saved search is executed, it accesses the most updated information.

Many libraries offer InfoTrac's Biography and Genealogy Master Index and the Biography Resource Center. The Biography and Genealogy Master Index covers 13.6 million biographical sketches from numerous sources, such as dictionaries, Marquis Who's Who, subject encyclopedias, and volumes of literary criticism. It indexes sources with multiple biographical sketches rather than articles. Gale adds 300,000 new citations with each update, which takes place twice a year.

For full-text biographical information, use the Biography Resource Center, which contains information on almost 315,000 people throughout history and various disciplines. Here there are biographical narratives, thumbnail sketches, Marquis Who's Who entries, and magazine articles pulled from several respected sources. You'll find images and up-to-date reporting from magazines. There is also a research guide to conducting successful biographical research.

Ebscohost Research Databases. The Ebscohost Web of databases provides one easy-to-use tool that lets you search across multiple sources and disciplines. Among the many sources available are Masterfile Premier, with over 1,950 general reference publications; Newspaper Source, with full-text articles from 200 regional newspapers; Primary Search, with more than 60 magazines for elementary school searches; HAPI (the Hispanic American Periodical Index); and much more. Here you can save articles and citations, create a personal account, mark your search results, and add them to your folder. Other tools include citation generation, search alerts, and journal alerts.

Oxford University Press' Grove Art Online and Grove Music Online. Grove Art Online features the full text of all 34 volumes of the 1996 Grove Dictionary of Art, with annual additions of new material and updates to original entries. There are over 45,000 articles on the visual arts, 500,000 bibliographical citations, more than 40,000 links to images in galleries, libraries, and museums on the Internet, as well as over 100,000 images from the Bridgeman Art Library.

Grove Music Online represents an integration of the 29 volumes of the New Grove Dictionary of Music and Musicians (second edition), the New Grove Dictionary of Opera, and the New Grove Dictionary of Jazz (second edition). It includes biographies, articles, illustrations, sounds, and links. This database includes the new Listening to Grove, with music samples you can listen to with Sibelius's Scorch plug-in.

infoUSA's Reference USA. Reference USA contains detailed information on more than 12 million U.S. businesses, 102 million U.S. residents, and health-care providers. The database also contains information on Canadian businesses and residents (compiled from white and yellow pages); SEC information; federal, state, and municipal data; and numerous directories, trade journals, and newspapers. You can download data on businesses including name, address, phone number, number of employees, principals' names, sales figures, credit ratings, and more.

Mergent Online. Mergent Online offers the same detailed company analysis as its print series, but the online version lets you create and customize multiple company reports. You'll find business descriptions, histories, properties, subsidiaries, officers, and financial statements "as reported." The content includes company information and annual reports from organizations in the U.S. and abroad. A report can include financial highlights, profitability ratios, debt management, asset management, stock price, and valuation figures.

Learning Express Library. Learning Express Library (formerly Learn a Test) is a testing resource spanning multiple disciplines and age levels. It offers practice tests for nearly every academic group (from fourth grade and up) and trade groups. There are tests for basic skills in reading, writing, and math, as well as civil service tests, college preparation, graduate school entrance, military, real estate, and much more. Each test includes associated sections, such as reading comprehension, math, and practice tests.

Ebscohost's Searchasaurus and Gale/Thomson's InfoTrac Kids Edition. These kid-friendly reference databases contain magazine and journal articles as well as basic encyclopedia and dictionary resources. Their uncluttered interfaces, fun graphics, and easy search and topic links make navigation simple for children.

Use It or Lose It

Libraries spend your tax dollars to provide the best information available. In these troubled budgetary times, librarians carefully track database usage and, regretfully, cut useful resources when money is lacking. It's time to rediscover the library and boost those usage numbers. Become a member and don't forget to visit your library in person, too.

Janet Rubenking works in technical services at the Shields Library, University of California-Davis.

fr.: http://www.pcmag.com/article2/0,1759,1463177,00.asp

The Next Small Thing

By Tara Calishain
March 2, 2004

Going to a search engine site to look up information is so 20th century. Why not use a search toolbar instead? These little additions to your browser put a wealth of search options at your fingertips, and they make searching quick and easy. One caveat: Most of the toolbars require ActiveX, so you'll have to make sure your browser and firewall allow it. And most toolbars work only in Internet Explorer. But if you don't use IE, don't worry; we've got a couple of options for you, too.

Google Toolbars

Google's official toolbar is available at http://toolbar.google.com/. You'll be asked to pick a language, and then you can download the toolbar. When you install the toolbar, you'll have to agree to both the terms of use and to the toolbar's feature set. Advanced functions require that information about the site you're browsing be sent to Google. You can disable the functions during the installation process if you're concerned about your privacy.


Once you've installed the toolbar, you'll find a bunch of useful tools. You can search Google, of course. You can also block pop-ups or fill in forms with one click. As you visit sites, the Google ToolBar shows the PageRank calculated by Google for each page, giving you an idea of how popular it is.

If you don't use IE you can still get great Google functionality, thanks to a couple of IE-independent toolbar projects. Check out Googlebar (http://googlebar.mozdev.org) if you use Mozilla or Netscape. Be sure Software Installation is enabled (Preferences | Advanced | Software Installation), or the installation won't work.

If you've used the "official" version of Google's toolbar with IE, Googlebar will look familiar. In fact, it can provide all the information the official version can except PageRank. You can use the I'm Feeling Lucky function, search Google's regular services (Google Groups, News, Directory, Catalogs, and so on) as well as the specialty searches (such as U.S. Government or Microsoft).

Another terrific option works with just about any browser. GGSearch (www.frysianfools.com/ggsearch) isn't a toolbar; it runs as a small application that you launch independent of your browser. When you enter a search term, a Google results page opens in your browser.

You can search several Google properties, including news, groups, images, and stocks. And you can choose options such as the relevance filter, safe search, and the time period to use for the search. Unless you use IE, choose the Enable custom browser option, because otherwise the default browser (usually IE) will open with the results.

Other Search Toolbars

Of course, Google isn't the only search engine, and others also offer toolbars.


You can download the Teoma toolbar at http://sp.ask.com/docs/teoma/toolbar. The default install puts three items on the toolbar: a query box, a Highlight button, and a button for e-mailing a page to a friend. The query box works as you'd expect: enter a search term and you get a Teoma page full of results. Hit the Highlight button in the toolbar and all the query terms on the page are highlighted instantly. If you click on the Email this Page to a Friend option, you'll get a pop-up box that you can use to send the title and URL of a page to an e-mail address. (Be aware, though, that pop-up blockers like the one on Google's toolbar may prevent the box from appearing.) Clicking on the Teoma button at the left of the toolbar lets you add a Search Dictionary button, and you can also change highlight colors and button styles.
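
As a rough illustration of how term highlighting works, here is a small Python sketch (this is not Teoma's code, and it naively ignores the difference between HTML tags and text):

    # A naive sketch of term highlighting (not Teoma's code): wrap each query
    # term found in a page's HTML in a highlighting tag. It makes no attempt
    # to distinguish tags from text, so treat it only as an illustration.
    import re

    def highlight(html, terms):
        for term in terms:
            pattern = re.compile(re.escape(term), re.IGNORECASE)
            html = pattern.sub(
                lambda m: '<b style="background: yellow">%s</b>' % m.group(0), html)
        return html

    print(highlight("<p>Search toolbars make searching easy.</p>", ["search"]))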

The default installation of the Ask Jeeves toolbar (http://sp.ask.com/docs/toolbar/) shows a few options such as the query box, highlight tool, and news search. Click on the Ask Jeeves logo and you can add buttons such as AJ Kids, Dictionary, Stocks, and Weather. The coolest option is the Zoom button, which lets you shrink the size of the Web page you're viewing so you can print it on fewer pages.

You can download AltaVista's toolbar at www.altavista.com/toolbar/default. Default buttons include an Information lookup button (for currency conversions, a dictionary, ZIP and area codes, and so forth), a translation tool, and a pop-up blocker. Additional buttons (accessible by clicking on the AltaVista icon and choosing Add/Remove Buttons from the menu) include individual lookup buttons for many of the reference items as well as for many AltaVista properties (including news, images, and so on).

If you want to do all your searching in one place, try the metasearch engine Dogpile's toolbar (www.dogpile.com/info.dogpl/tbar). Besides returning quick results from the top search engines, the toolbar gives you a news ticker, local weather, and Dogpile's Cursor Search, which lets you select any word or phrase on a page and then right-click to search.

Adding a toolbar can save you lots of time and make your searches easier. Adding a few—if you can handle the clutter—puts a variety of cool options right where you need them.

fr.: http://www.pcmag.com/article2/0,1759,1523364,00.asp

IE toolbars offer more than mere searches

By Anick Jesdanun, Associated Press

NEW YORK — Search engine toolbars for the Internet Explorer browser have become nearly essential tools online: They can block pop-up ads, alert you to new e-mail, even protect you from scams.
You'd need a half-dozen to combine all the best features, the Internet equivalent of leaving home in the morning with six different wallets.

So to narrow the choice I tested 11 — two of which, from America Online and EarthLink, debuted Monday.

Toolbars from the search leaders — Google, Yahoo, Microsoft and AOL — all have decent pop-up blockers that kill windows I don't want (ads) and permit ones I request (shipping details, for instance). And they're all a snap to download and install.

The Google Toolbar includes an extremely useful feature for frequent online shoppers. It automatically fills out online forms, such as name and address. A password protects stored credit card information. And if you keep a Web journal using Blogger software, which Google bought last year, you can add entries from the toolbar.

A small bell appears on the Yahoo Companion Toolbar when users of Yahoo e-mail accounts have new messages. You can also access bookmarks of favorite Web sites that you've stored on Yahoo. What I like best about the Yahoo toolbar is its portability. Settings are stored online, so you can customize it or add bookmarks wherever you are.

As for Microsoft, you can launch Hotmail and Messenger from its MSN Toolbar, but there's nothing special once you're there.

The AOL Toolbar displays the number of mail messages you have — if you're logged on already through AOL's regular software.

Yahoo outperforms MSN and AOL by allowing sign-ins from the toolbar, but Google outshines all of them. I found its search engine and extra features most useful. It even has a green bar that shows the relative popularity of the site you're visiting.

Among the rest, toolbars from Dogpile and Ask Jeeves have decent pop-up blockers. AltaVista and EarthLink make mistakes recognizing legitimate pop-ups, and Alexa keeps popping up annoying prompts asking whether I want that pop-up or not.

Dogpile, Ask Jeeves and Alexa have buttons for mailing Web links to a friend. Alexa's was the best.

Beyond that, each has its own handy features:

• Dogpile supports an emerging technology called Really Simple Syndication, or RSS. With it, headlines from your favorite Web journals, news sources, or other RSS-enabled sites scroll across the toolbar. (A rough sketch of the idea appears after this list.)

• Ask Jeeves lets you shrink and expand entire Web pages — not just their text in the more limited manner of the Internet Explorer browser.

• Alexa, owned by Amazon.com, suggests related sites, as in "People who visit this page also visit ..." It's similar to Amazon's shopping recommendations. And should a Web site disappear, a copy at the Internet Archive may be reachable with the click of a button.

• AltaVista has a button for translating text to and from other languages. With it, you might get at the gist of what's going on.

• EarthLink blocks fake EarthLink, eBay and PayPal sites that try to steal your passwords or credit card numbers. It also searches your computer for malicious programs known as "spyware," though to remove them, you need an EarthLink account.
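
For the curious, here is a rough Python sketch of what an RSS headline ticker like Dogpile's does behind the scenes (this is not Dogpile's code, and the feed URL is a placeholder): fetch a feed and read out its item titles.

    # A rough sketch of an RSS headline reader; the feed URL is a
    # placeholder for any RSS 2.0 feed.
    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "http://example.com/news.rss"  # placeholder feed address

    with urllib.request.urlopen(FEED_URL) as response:
        tree = ET.parse(response)

    # In RSS 2.0, headlines live in <channel><item><title> elements.
    for title in tree.iterfind(".//item/title"):
        print(title.text)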

Among these features, EarthLink's ScamBuster is the most promising, especially once sites identified by the anti-spam company Brightmail are added next month.

I also tried GuruNet, which helps you cull useful information from the junk by narrowing results to reference materials like encyclopedias and maps. It reduces clutter. But unlike the others I tested, GuruNet requires a cash outlay — $29.99 a year.

I like Dogpile's RSS scroller and Alexa's recommendations. In both cases, there's a trade-off. They collect information on surfing habits, so read their privacy policies carefully.

Most people will probably be fine with either the Google or the Yahoo toolbar, unless there's a specific feature they'd use a lot — perhaps they frequent foreign sites (download AltaVista's as well) or have poor eyesight (use Ask Jeeves).

Sadly, however, all these toolbars work only on Windows computers running Internet Explorer. For other browsers, try GGSearch, a standalone application that looks like a toolbar. You get basic Google searches but not extras like pop-up blockers.

Or use Opera's browser, as I do. It has the Google search box and pop-up blocker built-in, though its form filler isn't as good.

fr.: http://www.usatoday.com/tech/techreviews/products/2004-04-19-toolbars_x.htm

Google faces renewed assault

Reuters

LONDON: Google's free e-mail service, Gmail, came under fresh fire on Monday, when an international privacy rights group said the service, which has yet to be rolled out, violated privacy laws across Europe and elsewhere.

Privacy International, which has offices in the United States and Europe, said it filed complaints with privacy and data-protection regulators in 17 countries, from Europe to Canada and Australia.

The group's initial complaint, filed in Britain, has been rejected.

Google said Gmail, which is still being tested, complies with data-protection laws worldwide.

The user terms, however, sparked an outcry among consumer advocacy groups and some Internet users because Google said its computers would scan e-mails for keywords to use in sending Gmail users targeted advertisements. It would also keep copies of e-mails even after consumers had deleted them.

Privacy International said these and other terms breach EU privacy laws, which are stricter than privacy laws in the United States.

"We look forward to a detailed dialogue with data-protection authorities across Europe to ensure their concerns are heard and resolved," a statement from Google said.

fr.: http://www.iht.com/articles/515986.html

Managing Search Marketing Campaigns

By Laura Thieme, Guest Writer
April 20, 2004
While search engine paid placement campaigns can be immensely profitable, effective bid management can be time-consuming and can quickly become a drawn-out game of chess, or a tug of war, depending on your rules of engagement.

A special report from the Search Engine Strategies 2004 Conference, March 1-4, New York.

Choosing a price or bidding position is relatively easy, especially if you want to be within the top three positions. But can you save money by more effectively managing positions, ad creative, and keyword choices? How do you know which terms are getting the best ROI?

Speakers on the Ad Management Case Studies panel answered these and other questions. The panel featured both vendors and users of ad management tools. Users presented their reasons for choosing a vendor's solution, and the vendors described their basic features and pricing and answered specific questions during the Q&A.

Presenting case studies were Mondy Beller of Shoes.com, Eric Neuner of WebMD's Medscape, and Jeff Landers of Offices2Share. Vendor representatives included Dave Carlson of GoToast (now a subsidiary of eonMedia), a do-it-yourself bid management and tracking solution for agencies and in-house online marketing managers; Greg Byrnes of SendTraffic.com; and Kevin Lee of Did-it.com. The latter two represented full-service consultative approaches that included their own in-house tracking tools.

Using GoToast, you may find setting up bid management rules and tracking return on investment (ROI) a little more difficult than with the other products. You'll need to learn the bidding jargon and the various rules. For ROI tracking, you'll need to append some JavaScript code to your site and set up tracking URLs, so it helps to have training from GoToast. BidRank, a desktop bid management solution that excludes Google, may also require some basic training.
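
To make the tracking-URL idea concrete, here is a hedged Python sketch (the parameter names are invented, not GoToast's actual scheme): each paid listing points at the landing page with extra parameters naming the engine and keyword, so a later conversion can be attributed back to the ad that produced it.

    # A hedged illustration of a tracking URL; "src" and "kw" are made-up
    # parameter names, not GoToast's actual scheme.
    from urllib.parse import urlencode

    def tracking_url(landing_page, engine, keyword):
        return landing_page + "?" + urlencode({"src": engine, "kw": keyword})

    print(tracking_url("http://www.example.com/offices", "overture",
                       "short term office space"))
    # -> http://www.example.com/offices?src=overture&kw=short+term+office+space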

That's the added value of a full-service approach from Did-it and SendTraffic.com. "I don't have time to worry about the latest bid rate," said Jeff Landers of Offices2Share. "I used to be afraid to leave for lunch, because I'd worry that someone would outbid me during that time." With a full-service solution, a business owner doesn't have to worry about learning rules-based bidding, appending JavaScript to the web site, or making sure ROI tracking is working. Did-it.com and SendTraffic.com are good choices for a company that wants to focus on fulfilling customer orders and running the business itself.

For other firms, a hands-on approach to managing a search marketing campaign is the only way to go. Mondy Beller of Shoes.com ultimately decided that no one knew her business better than her own team. Shoes.com, an online retailer, sells 150 brands and 2,500 shoe styles and keeps over 30,000 shoes in stock. The company uses a combination of organic search engine optimization, paid placement, paid inclusion, shopping comparison, affiliate, and email marketing. Shoes.com is managing over 1,000 keywords and is preparing to add another 500.

The biggest challenges for Shoes.com were constantly changing inventory, fast-moving brands, and the need to limit overseas clicks, since the company ships only to North America. Beller noted that the craze for Ugg boots over the previous holidays caused inventory to run out within hours of posting. To maintain a positive ROI, Shoes.com needed a robust tool that could quickly pull ads and keep the site from attracting visitors who would not be able to purchase out-of-stock items.

Shoes.com wanted the ability to manage the program in-house, monitor keywords in real time, and quickly adjust keywords and bids as needed. The company uses GoToast's BidManager, ProfitBuilder, and MasterList. BidManager enables rules-based bidding: it automatically checks the status and position of bids and makes appropriate bidding changes based on the parameters you set.
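
As a simplified illustration of the kind of rule such a tool automates (the function and parameter names are hypothetical, not GoToast's API), here is a Python sketch that keeps a keyword near a target position without exceeding a maximum CPC:

    # A simplified, hypothetical position-based bid rule: move toward a
    # target position while never exceeding a maximum cost per click.
    def next_bid(current_bid, current_position, target_position, max_cpc, step=0.05):
        if current_position > target_position and current_bid + step <= max_cpc:
            return round(current_bid + step, 2)   # slipped below target: bid up
        if current_position < target_position:
            return round(max(current_bid - step, 0.05), 2)  # above target: bid down, save money
        return current_bid                        # at target (or capped): leave the bid alone

    # Sitting in position 5 while aiming for position 3:
    print(next_bid(current_bid=0.40, current_position=5, target_position=3, max_cpc=0.75))  # 0.45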

Shoes.com no longer spends countless hours checking bids or pulling bids down at night. ProfitBuilder monitors not only paid placement keywords but also the keywords customers use to reach the site from other web sites, including organic search referrals. Using that information, the company identifies new keywords that can deliver additional traffic. It also tracks conversion rates by referring site.

The MasterList provides a single interface where Shoes.com can manage all of its search and product listings at Google, Overture, and other PPC providers. Using GoToast, Shoes.com reduced bid management time from three hours to one hour per day. The number of active keywords increased from 300 to 1,000. The company's cost-to-sales ratio decreased from 40% to 20%, while revenues more than tripled.

Jeff Landers from Offices2Share.com noted that bidding for keywords has become much more competitive over the past year. Landers was faced with an average CPC price increase of over 200% in less than one year. Using trafficPatrol and trafficROI tools from SendTraffic.com, Landers was able to match keywords to the value of an actual sale and revise online advertising strategy accordingly.

With ROI tracking, Offices2Share.com was able to show that some keywords thought to perform well were actually underperforming. For example, for this company, plural keywords did not perform as well as their singular versions, and a phrase like "short term office space for rent" attracted more targeted visitors. By employing bid management and ROI technology, Offices2Share increased lead volume while maintaining a profitable cost per lead (CPL).
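
A toy Python calculation (with invented numbers; this is not SendTraffic's tool) shows why per-keyword ROI tracking matters: keywords with similar click costs can have very different cost-per-lead figures.

    # Invented numbers, purely illustrative: compare cost per lead and ROI
    # across keywords, the kind of comparison per-keyword tracking enables.
    keywords = {
        # keyword: (total cost, leads, revenue)
        "office space": (480.00, 6, 900.00),
        "short term office space for rent": (135.00, 9, 1350.00),
    }

    for kw, (cost, leads, revenue) in keywords.items():
        cpl = cost / leads
        roi = (revenue - cost) / cost
        print("%-36s CPL $%.2f   ROI %.0f%%" % (kw, cpl, roi * 100))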

Eric Neuner of the WebMD Medscape Health Network chose Did-it.com's Maestro CPC ROI tracking tool for bid management and reporting. Medscape serves the largest online population of clinically active physicians, and the company's goal is to acquire physician users at the lowest cost per order (CPO). Did-it's reporting enabled measurement and more granular management against a specific cost-per-MD-registration goal, yet required significantly less time to manage than Medscape's previous tools. The company increased monthly clicks by 14%, increased registrations by 19%, decreased average cost per click by 26%, and cut cost per registration by 30%.

With tools and services such as those described on this panel, you can track the ROI of every keyword phrase, whether you manage the campaign yourself or have a full-service provider run it for you.

Laura Thieme is the president and founder of Bizresearch, a search engine optimization (SEO) and Web site traffic analysis company.

fr.: http://searchenginewatch.com/searchday/article.php/3342181

Cool search: A9.com

Search:

Search Inside the Book™: In addition to web search results, we present book results from Amazon.com that include Search Inside the Book. When you see an excerpt on any of the book results, click on the page number to see the actual page from that book. (You will need to be registered at Amazon.com.)

Adjustable Columns: Simply drag the boundaries between the columns to the left or the right to change the width of the different result columns (web, books, history). You can also close any column at any time. The next search will remember these new settings (if you allow cookies). This feature currently does not work on all browsers (but we're working on it!).

URL Short Cuts: At A9.com you can search directly from the browser URL box by typing:

a9.com/query

Search History: All your searches at A9.com are stored on our servers and shown to you at any time from any computer you use. Clicking on a link performs the search again. You can hide the window at any time and a password will be required to open it again. You can edit your history, for example, to clear an entry.

Click History: If any of the web search results include a site that you have seen before, it's marked on the result. We even tell you the last time you visited that site.

Site Info: Place the cursor on one of the Site Info buttons to see a lot more information about that site without leaving the search result page.

Web Search: Web search results are provided by Google.

A9 Toolbar:

Web Search: Search the web and Amazon.com's Search Inside the Book™ results. You can also search Amazon.com, the Internet Movie Database, and Google, and look up words in a dictionary and thesaurus.

Search Highlighter: The toolbar will automatically highlight your search terms in a light yellow. By using the highlighter menu, you can see how many times your search terms appear on the page, and jump to each occurrence of a specific word. Hint: You don't have to do a search to use the highlighter. Just type one or more words in the search box and click the highlight button.

Your History: Keep track of your most recently visited sites and your most recent searches. The toolbar remembers the Web pages you've visited even if you switch computers.

Diary: This is the newest and (we think) coolest feature of the toolbar. You can take notes on any web page and refer to them whenever you visit that page, on any computer that you use. Your entries are automatically saved whenever you stop typing or when you go to another page. (A rough sketch of the idea appears after this feature list.)

Site Info: See information about the website you are visiting, including related links, site statistics (including traffic rank), sites linking to this site, and user ranking. Select from the menu to go to the site's page on Amazon.com where you can get more information and write a review about the site.

Pop-up Blocker: Stop those annoying pop-up ads.
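
The per-page, cross-computer notes of the Diary are essentially a small key-value store keyed by URL. Below is a rough Python sketch of that idea (not A9's code): A9 keeps the notes on its own servers, so the local JSON file and function names here are hypothetical stand-ins that just show the shape of the data.

    # A rough sketch of the Diary idea: notes keyed by page URL, saved to a
    # small JSON file standing in for A9's server-side storage.
    import json
    import os

    NOTES_FILE = "diary.json"  # hypothetical local store

    def load_notes():
        if os.path.exists(NOTES_FILE):
            with open(NOTES_FILE) as f:
                return json.load(f)
        return {}

    def save_note(url, text):
        notes = load_notes()
        notes[url] = text
        with open(NOTES_FILE, "w") as f:
            json.dump(notes, f, indent=2)

    save_note("http://a9.com/", "Try the adjustable columns next time.")
    print(load_notes().get("http://a9.com/"))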

Search Inside the Book is a trademark of Amazon.com, Inc.

A9 and the A9 logo are trademarks of A9.com, Inc.

fr.: http://a9.com/-/company/whatsCool.jsp

Amazon launches search engine A9.com

CNET News: Stefanie Olsen, 15/04/2004

Amazon.com has quietly launched its long-awaited search engine (currently in beta), in what looks like a challenge to the web search tools of Google and Yahoo.

After nearly seven months of preparation, A9.com, the search site run by Amazon subsidiary A9, launched on Wednesday (the 14th). It features a fresh design that lets users organize and save web search results, review their search history, and find books on Amazon related to their query keywords. The company is also promoting a search toolbar that doubles as a pop-up ad blocker.

A9 spokeswoman Alison Diboll said: "We want to enrich our users' e-commerce search experience, so we're using this beta to gather first-hand feedback from them." She did not say how long the trial period would last or when a final version would launch.

When Amazon announced the creation of A9 last September, it said the unit would develop shopping search technology for use in-house and by other companies. That plan has clearly grown far more ambitious, and the A9 beta offers users a choice of different search tools.

A9's service is open to Amazon customers and registered users of the site.

Seattle-based Amazon is entering the search market at a time of fierce competition and heavy spending. Google, Yahoo, Microsoft, and a crowd of upstarts are all racing to improve their search technology and win users' loyalty, vying for the enormous potential of the advertising-supported search market. Paid search is highly profitable and the fastest-growing segment of online advertising, making it a favorite among advertisers.

Paid search revenue helped Yahoo regain its footing after the dot-com bust, contributing to the $101 million in earnings the company posted in the first quarter of this year, far exceeding analysts' expectations. The business has also fueled speculation that Google will hold an initial public offering this year. Jupiter Research forecasts that the paid search market will grow from $1.6 billion last year to $2.1 billion this year, and expand at a compound annual rate of 20% through the end of 2008.

More specifically, shopping search has become a battleground for first-tier Internet companies, because it is usually the last step before a user buys a product or chooses a service.

With that in mind, Yahoo recently updated its shopping search engine, and Google heavily promotes its own e-commerce engine, Froogle, on its home page. To protect its position, Amazon needs a measure of control over search, because shopping search ultimately drives sales at its online store.

A9 is led by the highly regarded computer scientist Udi Manber, a former chief technology officer at Yahoo who joined Amazon two years ago to help develop the "Search Inside the Book" feature, which lets users preview electronic versions of actual book pages. He moved to the A9 unit last summer and now leads a team of about 20 software engineers.

The search technology behind A9 is not all home-grown; it also includes technology developed by Google, Amazon, and another Amazon subsidiary, Alexa. Unlike Google, A9 displays an expandable column to the right of its search results; clicking it opens related book listings or the user's past searches. The service also displays sponsored listings from Google. Data stored on the company's servers can even tell users which sites they have visited and when. (Users must register to view their personalized search history.)

The A9 toolbar lets users search the web, Amazon.com, the Internet Movie Database, and Google, and look up word definitions. Newer still, it can record notes about the pages you visit and store them so they can be retrieved from any computer.

Users can also search directly from the browser address bar by typing "a9.com/" followed by the query. For example, to look up the Harry Potter books, enter "www.a9.com/harry potter". (唐慧文)
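
As a side note, the browser URL-encodes the space in that shortcut; a minimal Python sketch (not anything A9 ships) of building the same URL programmatically:

    # A minimal sketch of building the a9.com/<query> shortcut URL; the
    # space is simply URL-encoded.
    from urllib.parse import quote

    query = "harry potter"
    print("http://a9.com/" + quote(query))  # http://a9.com/harry%20potter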

fr.: http://taiwan.cnet.com/news/software/0,2000064574,20088953,00.htm