The Super Site Searcher - FAQ


What does my web server need for this program to work?

Check that your web server has Perl 5 with the LWP
module installed.  Also, if you are ordering the SQL
module, check that you have access to a database
server such as mSQL or MySQL.
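
As a rough sketch, assuming you have shell access and
that the interpreter is on your PATH under the name perl
(neither is guaranteed on every host), a check along
these lines reports the Perl version and whether the
LWP module loads:

```shell
# Assumption: the interpreter is called "perl" and is on the PATH.
if command -v perl >/dev/null 2>&1; then
    # $] holds the numeric version of the running Perl.
    perl -e 'print "Perl version: $]\n"'
    # Loading LWP succeeds only if the module is installed.
    if perl -MLWP -e 'print "LWP version: $LWP::VERSION\n"' 2>/dev/null; then
        lwp_status="installed"
    else
        lwp_status="missing"
    fi
else
    lwp_status="no-perl"
fi
echo "LWP: $lwp_status"
```

If this prints "missing" or "no-perl", that is the thing
to raise with your web host.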

Telnet access is useful but not required.
You do need to be able to run a nightly cron job (a timed
event).  This cron job updates your database every night by
adding new urls and verifying old urls.  If you need to,
you can run it by hand through a telnet session instead
of on a nightly schedule.

I have found that web hosts don't like to change
their current setup for any one customer.
If you have no particular attachment to your web host
and they don't support what is needed, the best
solution is to move to a web host that is
guaranteed to support the script.  It is not that
big a deal to move your site over.  And if your current
web host lacks these basic things, they are obviously
not keeping up with the times.  If they don't have
this stuff, what else might they be lacking?

How do I install it on my web server?

First, you need to change the first line of
the .cgi files to reflect the location of
perl version 5 on your server:

#!/usr/bin/perl5

Be sure to leave the #! in front of the full path
to perl5.
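
If you are not sure where perl lives, something like the
following may help.  It is only a sketch: it assumes shell
access, that the interpreter is named perl5 or perl, and
that you run it in the directory holding the .cgi files.

```shell
# Print the full path of the Perl interpreter, trying perl5 first.
perl_path=$(command -v perl5 || command -v perl || echo "not found")
echo "perl is at: $perl_path"

# Show the first line of every .cgi file so it can be compared
# with the path printed above.
for f in ./*.cgi; do
    if [ -e "$f" ]; then
        printf '%s: %s\n' "$f" "$(head -n 1 "$f")"
    fi
done
```

The first line of each .cgi file should be #! followed by
the path that was printed.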

Remember, on some servers your CGI scripts must be
in the cgi-bin directory.  You can tell whether
this is the case by installing the script in
another directory and then trying to access it.
If the server just prints out the script
as text, you have to put the script in the
cgi-bin.

Next, you need to be sure that the saves directory exists
and is writeable.  To do this, create the directory
and run chmod 777 on it (chmod 777 'directory name').

You also need to chmod 755 the .cgi files.

The chmod commands can be run through telnet or through an
ftp client that supports them.  Setting permissions with
chmod is generally not necessary on NT servers.
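
From a telnet session on a Unix server, the two permission
steps above might look like the sketch below.  INSTALL_DIR
is a made-up variable for this example; it is assumed to
point at the directory that holds the .cgi files and
defaults to the current directory.

```shell
# Hypothetical variable: the directory containing the .cgi files.
INSTALL_DIR="${INSTALL_DIR:-.}"

# Create the saves directory if it is missing and make it writeable.
mkdir -p "$INSTALL_DIR/saves"
chmod 777 "$INSTALL_DIR/saves"

# Make every .cgi file in the install directory executable.
for f in "$INSTALL_DIR"/*.cgi; do
    if [ -e "$f" ]; then
        chmod 755 "$f"
    fi
done

# List the directory so the new permissions can be checked by eye.
ls -ld "$INSTALL_DIR/saves"
```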

After that, you just need to be sure that all of the
other config settings in site_searcher_config.cgi are correct.  Older
versions have the config settings at the beginning of
the site_searcher.cgi file.  

The next step is to get url_adder.cgi to run on
a daily or monthly basis.  It is the script that actually
updates the database by adding new urls (crawling
the sites), updating existing urls (deleting bad urls or
renewing indexed data), and deleting urls that
you have specified for deletion.

First, you need to edit the maindir setting near the
top of the url_adder.cgi script.  It should be
set to the full path to your admin directory.  If
you are unsure of the full path, ask your web host.
It should start with either c: (on NT) or / (on Unix).

To activate this script, you need to set up a cron
job.  The mycron file contains the appropriate info
to get a cron job going, except that you must edit
it to change the paths for your server.  Be sure
not to change the numbers unless you understand
what they mean.  If you can telnet to your site,
you can type 'crontab mycron' in the directory where
mycron is to activate it.  If you can't telnet in,
you must ask your web hosting company how to get
set up with a cron job.  Usually, it just involves
sending them the mycron file.
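
For orientation, here is a sketch of what a crontab file
of this kind usually looks like.  It is not the mycron
file shipped with the package: the file name
mycron.example, the 3 a.m. run time, and both paths are
made up for illustration.  The five leading numbers are
the schedule fields.

```shell
# Write an illustrative crontab file (hypothetical name, time, and paths).
cat > mycron.example <<'EOF'
# minute hour day-of-month month day-of-week  command
0 3 * * * /usr/bin/perl5 /full/path/to/admin/url_adder.cgi
EOF

# Show the file for review before installing anything.
cat mycron.example

# To install it (this replaces your current crontab) and list the result:
#   crontab mycron.example
#   crontab -l
```

In this example "0 3 * * *" means minute 0 of hour 3 on
every day of every month, i.e. once a night at 3 a.m.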

If you are installing on an NT server, you need
to set up a 'scheduled task' (the NT equivalent of a
cron job) instead.

Go to: Start - Programs - Admin Tools - Scheduled tasks

Then through the interface you can easily run the
url_adder.cgi script at a given time.  Just
set it up to run once a day late at night.   

Now, you should be all set to go!

How do I get the form emails to come to me?

The site_searcher_config.cgi file has a setting for this.

How do I change the number of urls returned by the engine?

Inside the site_searcher_config.cgi file there is a setting
for this.  You can set it to whatever number you wish.

How do I change the look of the template files?

Template files are generally stored in the html_templates 
directory.   

The first part of each template file defines the
look that each item in the current search results will
have.  Everything below '--START OF EACH ITEM DEF--' and
above '--END OF EACH ITEM DEF--' is part of this definition.

There are other items that get defined for the 
various modules as well.  They will use the
same concept of an item def.

The variables, written as (*Variablename), are field names
coming from your database.  You can position them however
you like.

Below that section is just normal HTML.  That
is where you would adjust the look of the page,
such as the background color and the look of the form.
This form of template is very flexible, since it
reduces everything to just another bit of
html.


How do I make a link to a specific search of my choice?

Run the actual search through your engine and then use
the url that is on the returned page.

Does it spider the site or just get the url specified?

It just gets the title, description, keywords, and so on
from the specified url.  Crawling to other urls
linked from the original url, i.e. spidering,
is not done.

Can I turn off the url verify and email?

Yes, those are easily adjustable config settings.

How much disk space does a 10,000 url database take up?

Approximately 5 MB.

Can I turn off the ranking system?

Yes, there is a config setting for this.  There is a
speed advantage to doing this: the program doesn't
have to sort the urls, it merely returns them in
'most recent first' order.

How do I adjust the ranking system?

There are config settings for this in the site_searcher_config.cgi
file.  Basically, it's just a series of numbers that
get added to the rank of a site as each particular criterion
is matched.  For example: if the search term is in the title, add 1000.
These config settings allow you to adjust the ranking
to fit the way you would like it to work.  The default
settings that I have set up should be good enough for you
if you aren't interested in playing around with this
feature right away.
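
To make the idea concrete, here is a small stand-alone
sketch of additive ranking.  It is not the package's own
code, and the real setting names in site_searcher_config.cgi
are not shown here.  Only the title weight of 1000 comes
from the example above; the other two weights, the
function name rank_url, and the sample data are made up.

```shell
# Weights: 1000 for a title match is the example from this FAQ;
# the description and keyword weights are hypothetical.
TITLE_WEIGHT=1000
DESC_WEIGHT=100
KEYWORD_WEIGHT=10

# rank_url TERM TITLE DESCRIPTION KEYWORDS
# Adds a weight for each field that contains TERM (case-insensitive)
# and prints the total.
rank_url() {
    term=$1 title=$2 description=$3 keywords=$4
    rank=0
    if printf '%s\n' "$title" | grep -qiF -e "$term"; then
        rank=$((rank + TITLE_WEIGHT))
    fi
    if printf '%s\n' "$description" | grep -qiF -e "$term"; then
        rank=$((rank + DESC_WEIGHT))
    fi
    if printf '%s\n' "$keywords" | grep -qiF -e "$term"; then
        rank=$((rank + KEYWORD_WEIGHT))
    fi
    echo "$rank"
}

# Two made-up entries searched for the term "perl".
rank_a=$(rank_url "perl" "Perl CGI scripts" "Free scripts for your site" "cgi, perl, scripts")
rank_b=$(rank_url "perl" "Web hosting" "Hosting with Perl 5 support" "hosting")
echo "A=$rank_a B=$rank_b"   # prints: A=1010 B=100
```

Entry A matches in the title and the keywords, so it
outranks entry B, which matches only in the description.
Raising or lowering a weight changes how much that kind
of match counts.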

How does the category for each url get set?

This can be sent in with your users' url submissions,
and can then be adjusted via the online admin
area.  You can see this easily in the demo.

Can I put a search box on another web site?

Yes, it is just a standard web form so it
can be placed anywhere on the web, as long as you
use the full url, like http://... 

What is a cron job and what does this one do?

A cron job is just a way of letting the script
run every so often without you having to execute
it directly.   For this package, the url_adder
script runs once every day.  It can be run
less or more frequently depending on your 
preference.   

The url_adder takes all of the submitted urls,
grabs the pertinent data from them, and
adds it to the database.  Once a month, the
url_adder checks all of the urls to be sure that
they are still valid.

Will the site slow down while the cron job is running?

No, this cron job is designed to run without causing
any problems with your server load.   

What can I do if my web host doesn't allow cron jobs or lacks other requirements?

Some web hosting companies don't allow such things or have
older versions of Perl installed.  If this is the case, you
need to move to a new host in order for this script to work.
This is good not only for this package but for any future
packages that you wish to install.  There is no reason
to stay on a server that doesn't keep up with the
current releases of everything.

For better hosting, check out:

http://www.fatcow.net/
http://www.nwrks.net/
http://www.he.net/
http://www.hiway.net/

Can I customize the search results page with my background and banners?

Yes, the html template files allow you to redesign
the look of that page.