

RiCoord

      This document describes the general functionality of the RiCoord library. RiCoord is a simple geocoding tool for your website. It consists of a database of USA addresses with corresponding geographical coordinates. It can be used to find the location of any given address or to calculate the distance between two addresses. The database is based on Census TIGER/Line files. The full database for address geocoding takes about 500 MB and contains 30,000,000 street block coordinates. Street intersection data takes about 400 MB.

Installation

      RiCoord is a standard Perl script, and the installation procedure is the same as for any other Perl script:

  • Copy everything to your CGI-BIN directory, preserving the subdirectory structure (upload datafiles in binary mode and Perl scripts in ASCII mode).
  • Set the correct permissions for the scripts (usually 755, but some servers require other permissions; check with your administrators if you are not sure). This applies only to Unix-like operating systems.
  • Check the first line of each script. Usually it should be "#!/usr/bin/perl", but again, check with your administrators if you are not sure.
  • Run the "find.pl" script. If something does not work, check the webserver error log.

      The database for each state is stored in a separate file, so you can upload only the databases for the states you need. Locations in other states will then be calculated using ZIP code average coordinates. Compact databases for selected counties or cities can be created for registered customers.

      Please note that the database does not hold coordinates of individual buildings; it is based on street block coordinates. You will therefore get the same coordinates for all buildings within the same street block. If the script can't find the given address in the database, it tries to return the closest possible location (usually the coordinates of the nearest block). If the script can't recognise the street name, it returns approximate coordinates based on the ZIP code.

      Address matching is a complex task with several sources of errors. The user may not know the exact address and may, for example, enter "500 Main St" when the exact address is "500 Main Ave", or "500 Main St NW" instead of "500 Main St NE". Another source of errors is the ZIP code. The TIGER database is updated every year, but ZIP codes change constantly, so there is always a possibility that a ZIP code in the database is incorrect. The script will try to find approximate matches to some extent, returning an error code and a list of possible matches.

Scripts and libraries:

  • cgi/ricoord.pm - library for geocoding functions (returns geographical coordinates of given postal address and calculates distance between two points).
  • cgi/find.pl - demo script which shows how it can be used.
  • cgi/template.htm - demo template file.


Datafiles:
  • cgi/data_hash/* - files in this directory are used by the RiCoord module for geocoding functions (one file per state; you may delete unwanted states).
  • cgi/data_hash2/* - files in this directory are used by RiCoord to find coordinates of intersections (one file per state; you may delete unwanted states).
  • cgi/data_hash3/* - files in this directory are used by RiCoord to find interpolated coordinates.
  • cgi/data_other/* - other files used by module.

RiCoord module interface

      The RiCoord module (file "ricoord.pm" for Perl and "ricoord.php" for PHP) provides several functions:

      get_coord($addr,$zip,$state,$city) - main function. Takes four parameters:

  • $addr - address in form "500 Main St NW";
  • $zip - ZIP code as 5-digit code;
  • $state - state as two-letter code;
  • $city - city name.

      The function returns an array of hashes (in most cases the array has only one element). Each hash has the following keys:

  • code
  • error_msg
  • addr
  • zip
  • state
  • city
  • long
  • lat

      The "code" key takes the following values:

  • 1 - requested address was found in database.
  • 2 - street was found in database, but the house number falls outside the known number ranges. One of the nearest street blocks is returned in this case.
  • 3 - no house number was specified and the street name was found in database. The script returns the average coordinates of the given street within the given ZIP code.
  • 4 - street was not found. The ZIP centroid is returned.
  • 5 - approximate match was found. The script may return several possible matches in this case.
  • -1 - fatal error, the ZIP-centroids database was not found.
  • -2 - fatal error.

      Usage:

      my @res  = get_coord($addr,$zip,$state,$city);
      my $code = $res[0]->{code};
      my $long = $res[0]->{long};
      my $lat  = $res[0]->{lat};

      or (for the PHP version):

      $res   = get_coord($addr,$zip,$state,$city);
      $code  = $res[0]['code'];
      $long  = $res[0]['long'];
      $lat   = $res[0]['lat'];

      Usually you need to specify the address, ZIP code and state to get correct coordinates. However, if the ZIP code is not entered, the script will use the city name to get a list of all ZIP codes for this city and search every ZIP code.
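
      The result codes listed above can be turned into a simple precision label before the coordinates are used. What follows is a minimal sketch, not part of RiCoord: the helper name "classify_result" and the sample hashes are made up for illustration, and a real script would pass the array returned by get_coord instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ricoord.pm): maps the documented
# "code" values to a coarse precision label.
sub classify_result {
    my ($res) = @_;                      # one hash reference from the result array
    my $code = $res->{code};
    return 'fatal'       if $code < 0;   # -1 or -2: no usable coordinates
    return 'exact'       if $code == 1;  # address found in the database
    return 'nearby'      if $code == 2;  # one of the nearest street blocks
    return 'street'      if $code == 3;  # street average within the ZIP code
    return 'zip'         if $code == 4;  # ZIP centroid only
    return 'approximate' if $code == 5;  # one of several possible matches
    return 'unknown';
}

# Made-up placeholder data in the documented shape. In a real script use:
#   my @res = get_coord($addr, $zip, $state, $city);
my @res = (
    { code => 5, addr => '500 main ave', zip => '00000', long => -77.0, lat => 38.9 },
    { code => 5, addr => '500 main st',  zip => '00000', long => -77.1, lat => 38.8 },
);

for my $r (@res) {
    printf "%s: %s (%.4f, %.4f)\n",
        classify_result($r), $r->{addr}, $r->{long}, $r->{lat};
}
```

      Looping over the whole array rather than reading only element 0 matters for code 5, where several candidate matches may be returned.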

      get_coord_interpolated($addr,$zip,$state,$city) - this function is similar to get_coord and has the same input and output parameters. Its datafiles are stored in the "data_hash3" directory and contain the full geometry of every street segment, so they take more disk space than the data in the "data_hash" directory, where only the middle point of every segment is stored. The function returns coordinates placed along the street according to the house number and the possible address range of the segment, and shifted by some amount to the left or right side of the street. If this function does not return a successful result, get_coord can be called to get approximate coordinates. Data in the "data_hash3" directory is used only by get_coord_interpolated and can be deleted if you don't need interpolated coordinates.
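
      The fallback from get_coord_interpolated to get_coord could be wrapped roughly as follows. This is only a sketch: the wrapper name "geocode_with_fallback" is hypothetical, and it assumes that only code 1 counts as a successful interpolated result, which the description above does not state explicitly. Stand-in functions replace the real library calls so that the example runs without the datafiles.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of ricoord.pm). $interpolated and $plain
# are code references, e.g. \&get_coord_interpolated and \&get_coord.
# Assumption: only code 1 is treated as a successful interpolated result.
sub geocode_with_fallback {
    my ($interpolated, $plain, @args) = @_;
    my @res = $interpolated->(@args);
    return @res if @res && $res[0]->{code} == 1;
    return $plain->(@args);          # approximate, block-level coordinates
}

# Stand-in functions returning made-up results in the documented shape.
my $fake_interpolated = sub { return ( +{ code => 4, long => 0, lat => 0 } ) };
my $fake_plain        = sub { return ( +{ code => 2, long => 1, lat => 1 } ) };

my @res = geocode_with_fallback($fake_interpolated, $fake_plain,
                                '500 Main St NW', '00000', 'XX', 'Sometown');
print "code: $res[0]->{code}\n";     # prints "code: 2"
```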

      parse_address($addr) - parses the address as required for use with the "get_coord" function: converts everything to lower case, replaces "Street" with "st", "Avenue" with "ave", etc. The function returns two scalars: house number and street name.

      Usage:

      my ($num, $street) = parse_address($addr);

Input: "1234 Main Ave North".
Output: "1234", "main ave n".
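
      To make the normalisation concrete, here is a simplified illustration that reproduces the input/output pair above. It is not the implementation used in ricoord.pm: the function name "sketch_parse_address" and the abbreviation table are assumptions, and the real function presumably handles many more cases.

```perl
use strict;
use warnings;

# Assumed, deliberately tiny abbreviation table (illustration only).
my %ABBR = (
    street => 'st', avenue => 'ave', road => 'rd', boulevard => 'blvd',
    north  => 'n',  south  => 's',   east => 'e',  west      => 'w',
);

# Hypothetical stand-in for parse_address: lower-cases the input, splits off
# a leading house number and abbreviates known words.
sub sketch_parse_address {
    my ($addr) = @_;
    $addr = lc $addr;
    $addr =~ s/^\s+|\s+$//g;                  # trim surrounding whitespace
    my $num = '';
    $num = $1 if $addr =~ s/^(\d+)\s+//;      # leading house number, if any
    my $street = join ' ',
        map { exists $ABBR{$_} ? $ABBR{$_} : $_ } split /\s+/, $addr;
    return ($num, $street);
}

my ($num, $street) = sketch_parse_address('1234 Main Ave North');
print "$num | $street\n";                     # prints "1234 | main ave n"
```

      A word-by-word table like this is naive (a street actually named "North" would also be abbreviated), which is one reason to rely on the library function in real code.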

      calc_dist($long1,$lat1,$long2,$lat2) - calculates the distance between two points. Returns the distance in kilometers (multiply by 0.6214 to get miles).

      Usage:

      my $dist = calc_dist($long1,$lat1,$long2,$lat2);
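
      The documentation does not say which formula calc_dist uses. As an illustration of a great-circle distance with the same argument order (longitude first, then latitude), here is a sketch based on the haversine formula. The function name "sketch_calc_dist" and the Earth radius of 6371 km are assumptions, so results may differ slightly from the library's.

```perl
use strict;
use warnings;

my $PI       = 4 * atan2(1, 1);
my $EARTH_KM = 6371;      # assumed mean Earth radius in kilometers
my $KM_TO_MI = 0.6214;    # conversion factor quoted in the text

# Hypothetical stand-in for calc_dist (haversine formula).
# Arguments are decimal degrees: longitude first, then latitude.
sub sketch_calc_dist {
    my ($long1, $lat1, $long2, $lat2) = @_;
    my ($phi1, $phi2) = map { $_ * $PI / 180 } ($lat1, $lat2);
    my $dphi    = $phi2 - $phi1;
    my $dlambda = ($long2 - $long1) * $PI / 180;
    my $h = sin($dphi / 2) ** 2
          + cos($phi1) * cos($phi2) * sin($dlambda / 2) ** 2;
    $h = 1 if $h > 1;     # guard against rounding just above 1
    return 2 * $EARTH_KM * atan2(sqrt($h), sqrt(1 - $h));
}

# One degree of longitude along the equator.
my $km = sketch_calc_dist(0, 0, 1, 0);
printf "%.1f km = %.1f miles\n", $km, $km * $KM_TO_MI;   # 111.2 km = 69.1 miles
```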

      decimal2degrees($num) - converts coordinates from the decimal format used by the module into degrees, minutes and seconds. This can be useful for output.

      Usage:

      my ($deg, $min, $sec) = decimal2degrees($num);
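
      The arithmetic behind such a conversion can be sketched as below. This is an illustration rather than the library's code: the name "sketch_decimal2degrees" is hypothetical, and carrying the sign on the degrees value is an assumption about how negative (western or southern) coordinates are reported.

```perl
use strict;
use warnings;

# Hypothetical stand-in for decimal2degrees.
# Assumption: the sign is carried on the degrees value only, so values
# between -1 and 0 would lose their sign in this simple version.
sub sketch_decimal2degrees {
    my ($num) = @_;
    my $sign = $num < 0 ? -1 : 1;
    my $abs  = abs $num;
    my $deg  = int $abs;                 # whole degrees
    my $rest = ($abs - $deg) * 60;       # remaining fraction in minutes
    my $min  = int $rest;                # whole minutes
    my $sec  = ($rest - $min) * 60;      # remaining fraction in seconds
    return ($sign * $deg, $min, $sec);
}

my ($deg, $min, $sec) = sketch_decimal2degrees(-77.5);
printf "%d deg %d min %.1f sec\n", $deg, $min, $sec;   # -77 deg 30 min 0.0 sec
```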

      my ($code, $state) = get_state_by_zip($zip) - returns state for given ZIP code.

      my ($code, $city) = get_city_by_zip($zip) - returns city name for given ZIP code.

      The PHP version of RiCoord is practically identical to the Perl version. You can place the files "find.php", "ricoord.php" and "template.htm" in any directory, but don't forget to set the variable "$data_dir" in "ricoord.php" to the directory with the datafiles.

Intersections

      RiCoord can find the coordinates of the intersection of two streets. To look up an intersection, enter the names of the streets separated by an ampersand in the address field (like "Street1 & Street2") and provide a ZIP code or city name and state code. This function uses a separate set of datafiles located in the "data_hash2" directory.
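
      As a small illustration of the address format, the helper below builds the "Street1 & Street2" string. The helper name "intersection_address" is hypothetical, and the get_coord call is shown only as a comment because it needs the RiCoord module and its datafiles.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of ricoord.pm): joins two street names
# with an ampersand, the format expected in the address field.
sub intersection_address {
    my ($street1, $street2) = @_;
    return "$street1 & $street2";
}

my $addr = intersection_address('Main St', 'Oak Ave');
print "$addr\n";                  # prints "Main St & Oak Ave"

# With RiCoord loaded, the string is passed like any other address:
#   my @res = get_coord($addr, $zip, $state, $city);
```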



http://risearch.org S.Tarasov, © 2000-2005