Sitesearch indexing failing

banerjek

Posts: 18
Sitesearch indexing failing - Posted: April 29, 2008 - 10:54 AM
Just yesterday, we started experiencing a new problem which is preventing SiteSearch from indexing properly. Since this feature is used quite a bit, we're trying to get this back up asap.

When running the indexing routine, here are the errors we see:
_______________________________________

Killing SiteSearch with pid 3465
SiteSearch server started.
PHP Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 37600257 bytes) in /home/httpd/htdocs/orbis/inc/app/sitesearch/lib/Extractors/PPT.php on line 7
PHP Warning: Unknown: open(/var/lib/php/session/sess_5qn0ojep0v8d3d9efkj02n22n2, O_RDWR) failed: Permission denied (13) in Unknown on line 0
PHP Warning: Unknown: Failed to write session data (files). Please verify that the current setting of session.save_path is correct (/var/lib/php/session) in Unknown on line 0
_____________________________________

I'm working with the local systems guys on the lack of permissions to the session directory, but it appears that the processing of PowerPoint presentations is what is breaking the indexing.
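For the session warnings, a quick check of the directory can narrow things down before the systems team gets involved. The following is a minimal sketch, not part of Sitellite or SiteSearch: `session_dir_status()` is a hypothetical helper name, and it only tests the path it is given with PHP's standard `is_dir()` and `is_writable()` functions.

```php
<?php
// Hypothetical diagnostic helper (an assumption, not a SiteSearch function).
// Reports whether a directory exists and is writable by the current user.
function session_dir_status ($dir) {
    if (! is_dir ($dir)) {
        return 'missing';
    }
    if (! is_writable ($dir)) {
        return 'not writable';
    }
    return 'ok';
}

// Example use, run as the same user that runs the indexer:
//   echo session_dir_status (ini_get ('session.save_path')) . "\n";
```

The result depends on which user runs the script, so a check from the web server can pass while the same check from the command-line indexer fails.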

For SiteSearch purposes, it is OK if PowerPoint presentations (or uploaded documents, for that matter) are ignored, as long as their metadata are included in search.

To get straight to the point, if my primary goal is to get indexing of ordinary pages operational again asap, what is the best thing for me to do? Any advice would be appreciated. Thanks,

kyle

lux

Posts: 645
Location: Manitoba, Canada
Re: Sitesearch indexing failing - Posted: April 29, 2008 - 11:30 AM
Hi Kyle,

To get it running again quickly, if you want to skip indexing the contents, you could just do a return ""; at the start of the extraction. The file is inc/app/sitesearch/lib/Extractors/PPT.php, and you would put that just inside the process() method like this:

function process ($text) {
    return "";
    // the rest of the code
}


Let me know if that takes care of it.
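A less drastic variant of the same idea would skip only the documents that are too large for the configured memory limit. The following is a minimal sketch under stated assumptions: `ini_bytes()` and `safe_to_extract()` are hypothetical helpers that do not exist in SiteSearch, the multiplier of 4 is an arbitrary guess at how much working memory extraction needs, and it assumes `$text` in `process()` holds the raw document data.

```php
<?php
// Hypothetical helper: convert a php.ini shorthand value such as "64M"
// into a number of bytes.
function ini_bytes ($value) {
    $value = trim ($value);
    $number = (int) $value;
    switch (strtolower (substr ($value, -1))) {
        case 'g': $number *= 1024; // fall through
        case 'm': $number *= 1024; // fall through
        case 'k': $number *= 1024;
    }
    return $number;
}

// Hypothetical guard: true when $length bytes of input should fit within
// $limit bytes, allowing $factor times the input size as working memory.
function safe_to_extract ($length, $limit, $factor = 4) {
    if ($limit <= 0) {
        return true; // a memory_limit of -1 means "no limit"
    }
    return $length * $factor <= $limit;
}

// Possible use at the top of process(), assuming $text is the raw data:
//   $limit = ini_bytes (ini_get ('memory_limit'));
//   if (! safe_to_extract (strlen ($text), $limit)) {
//       return "";
//   }
```

With the figures from the error log, the limit is 67108864 bytes (64 MB) and the failed allocation was 37600257 bytes, so a presentation of that size would be skipped under this guard while smaller ones would still be indexed.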

Lux

banerjek

Posts: 18
Re: Sitesearch indexing failing - Posted: April 29, 2008 - 3:10 PM
Thanks for the quick response. I tried returning an empty string from the process() method in PPT.php, but I'm not quite out of the woods yet. The content of FAQs appears to be indexed, but web pages are not. Here's what I'm getting when I try to reindex now:

_____________________________
Killing SiteSearch with pid 8011
SiteSearch server started.
FPDF error: Unable to find xref table - Maybe a Problem with 'auto_detect_line_endings'
PHP Warning: Unknown: open(/var/lib/php/session/sess_5v7uegs4ce9ljo7jg238spb7c3, O_RDWR) failed: Permission denied (13) in Unknown on line 0
PHP Warning: Unknown: Failed to write session data (files). Please verify that the current setting of session.save_path is correct (/var/lib/php/session) in Unknown on line 0
_______________________________

Any ideas as to what might be happening? Thanks,

kyle

banerjek

Posts: 18
Re: Sitesearch indexing failing - Posted: April 29, 2008 - 3:25 PM
Problem has been solved -- after I disabled the DOC and PDF extractors as well as the PPT one, everything I needed was indexed. Everything is good, so thanks for pointing me in that direction.
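Disabling several extractors by hand, as described in this post, could also be expressed as one skip list checked before any extractor runs. This is only a sketch: `should_skip()` and the list are hypothetical, the extensions are inferred from the file types named in the thread, and SiteSearch itself may dispatch to its extractors differently.

```php
<?php
// Hypothetical skip list (an assumption): file types whose contents
// should not be extracted.
$skip_extensions = array ('ppt', 'doc', 'pdf');

// Hypothetical check: true when the file's extension is on the skip list.
// The comparison is case-insensitive, so "Slides.PPT" matches "ppt".
function should_skip ($filename, $skip) {
    $ext = strtolower (pathinfo ($filename, PATHINFO_EXTENSION));
    return in_array ($ext, $skip);
}

// Possible use before calling an extractor:
//   if (should_skip ($file, $skip_extensions)) {
//       $contents = "";
//   }
```

Keeping the list in one place makes it easier to re-enable a file type later than editing each extractor's process() method separately.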

lux

Posts: 645
Location: Manitoba, Canada
Re: Sitesearch indexing failing - Posted: April 29, 2008 - 3:27 PM
I know the FPDF library hasn't been updated in a long time, but the only issue I've run into with it is that it chokes on encrypted PDFs. Unfortunately, it doesn't handle errors gracefully either.
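Since the parser does not recover from such files on its own, one option is to screen PDFs before handing them over. The sketch below is a heuristic, not part of FPDF, FPDI, or SiteSearch: `looks_like_parsable_pdf()` is a hypothetical name, and it only checks for the `%PDF-` header and for an `/Encrypt` entry, which encrypted PDFs carry in their trailer. It can misjudge a file whose content merely contains that string.

```php
<?php
// Hypothetical pre-check (an assumption, not a library function).
// Returns false for data that is not a PDF or that appears encrypted.
function looks_like_parsable_pdf ($data) {
    // Every PDF file starts with a "%PDF-" version header.
    if (substr ($data, 0, 5) !== '%PDF-') {
        return false;
    }
    // Encrypted PDFs reference an /Encrypt dictionary.
    if (strpos ($data, '/Encrypt') !== false) {
        return false;
    }
    return true;
}

// Possible use before extraction, assuming $data is the raw file:
//   if (! looks_like_parsable_pdf ($data)) {
//       return "";
//   }
```

A file that fails the check is simply skipped, which keeps one bad PDF from stopping the whole indexing run.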

FPDI (http://www.setasign.de/products/pdf-php-solutions/fpdi/downloads/) is now at 1.2 although SiteSearch is still using 1.1, so you could try updating that in SiteSearch's lib/Ext/fpdi folder and see if that helps.

There's also the Zend_Pdf package from the Zend Framework, which we'll likely move to in SiteSearch 2, since that already uses ZF's search package. ZF is PHP 5.1+ only, though, so we won't be back-porting it to SiteSearch 1, which is being kept around for PHP 4 compatibility.

Let me know if an FPDI update fixes it. If not, trying Zend_Pdf is probably the next thing to do if you can.

Lux

lux

Posts: 645
Location: Manitoba, Canada
Re: Sitesearch indexing failing - Posted: April 29, 2008 - 3:27 PM
Glad to hear!
