DBAddr pgsql://foo:bar@localhost/rutubesearch/?dbmode=cache
VarDir /usr/local/dpsearch/varrutube
LogsOnly yes
WrdFiles 32
URLDataFiles 32
StoredFiles 32

LocalCharset koi8-r

LangMapFile langmap/en.ascii.lm
LangMapFile langmap/en.utf-8.lm
LangMapFile langmap/fr.latin1.lm
LangMapFile langmap/fr.utf-8.lm
LangMapFile langmap/ru.utf8.lm
LangMapFile langmap/ru.utf8.lit.lm
LangMapFile langmap/ru.koi8-r.lm
LangMapFile langmap/ru.koi8-r.lit.lm
LangMapFile langmap/en.iso-8859-15.lm
LangMapFile langmap/en.windows-1252.lm
LangMapFile langmap/fr.latin1.lit.lm
LangMapFile langmap/fr.utf-8.lit.lm

MinWordLength 1
MaxWordLength 64
URLInfoSQL no
CheckInsertSQL yes
MarkForIndex no
CollectLinks no
CacheLogWords 1024
CacheLogDels 2048
OptimizeAtUpdate yes
GuesserBytes 32768
#LongestTextItems 32

Section body 1 0
Section title 2 128
Section crosswords 3 0
Section meta.keywords 4 256
Section sea 5 256
Section meta.description 6 256
Section ruid 0 64 "http://rutube\.ru/img/thumbs//([^\-]*)" "$1"
Section ruid 0 64 "http://img\.rutube\.ru/thumbs//?([^\-]*)" "$1"
Section ytid 0 64 "video_id=([^\'&]*)&" "$1"
Section ytid 0 64 "video_id:'([^\']*)'" "$1"
Section myvi 0 64 "&v=([0-9]+)&th" "$1"
Section ruid 0 64 "&m=http%3a%2f%2ffs-([a-z])\.myvi" "$1"

SEASentences 2048
SEASentenceMinLength 32

AddType text/html *

DetectClones yes
AccentExtensions yes
Crosswords yes

#HTTPHeader "User-Agent: DataparkSearch (+http://dataparksearch.org/bot) @ http://www.43n39e.ru/"

Period 7d
Realm hrefonly regex ^http://rutube.ru/tracks/index\.html\?p=
Realm hrefonly regex ^http://rutube.ru/tracks/recent\.html\?p=
Realm hrefonly http://www.youtube.com/categories*
URL http://www.youtube.com/categories

Period 180d
Realm hrefonly regex ^http://rutube.ru/tracks/top\.html\?p=
Realm hrefonly regex ^http://rutube.ru/tracks/date/any.html\?rd=any;p=

Period 2h
Server page hrefonly http://rutube.ru/tracks/index.html
Server page hrefonly http://rutube.ru/tracks/rated.html
Server page hrefonly http://rutube.ru/tracks/recent.html
Server page hrefonly http://rutube.ru/tracks/top.html
Server page hrefonly http://www.youtube.com/
Server page hrefonly http://www.youtube.com/google
Server page hrefonly http://www.youtube.com/googleru
Server page hrefonly http://www.youtube.com/user/AtGoogleTalks
Server page hrefonly http://www.youtube.com/user/australia
Server page hrefonly http://www.youtube.com/user/GoogleDeveloperDay
Server page hrefonly http://www.youtube.com/user/google
Server page hrefonly http://www.youtube.com/user/infonewsNZ
Server page hrefonly http://www.youtube.com/user/PureNewZealand
Server page hrefonly http://www.youtube.com/user/russiatoday
Server page hrefonly http://www.youtube.com/user/TEDtalksDirector
Server page hrefonly http://www.youtube.com/user/ucberkeley
Server page hrefonly http://www.youtube.com/user/googletechtalks

Period 12h
Server page hrefonly http://rutube.ru/tracks/date/this_week.html
Server page hrefonly http://rutube.ru/tracks/date/this_month.html
Server page hrefonly http://rutube.ru/tracks/date/any.html

Period 90d
Realm regex ^http://rutube.ru/tracks/[0-9]+\.html

Period 180d
Realm regex ^http://www.youtube.com/watch\?v=[0-9a-zA-Z_]+$

###
disallow http://rutube.ru/login*
disallow regex \.swf
disallow http://myvi.ru/ru/flash/* http://tizer.mediarotator.ru/* http://www.youtube.com/player*
Allow *

# NoIndexIf regex body "i_delete\.gif"
NoIndexIf regex body "