Ticket #99 (closed defect: fixed)

Opened 7 years ago

Last modified 7 years ago

Ezproxy logfile configuration only picking up the first resource accessed in an EZP session

Reported by: steve@… Owned by: smartp@…
Priority: major Milestone: v1.1.0
Component: Raptor ICA Version: v1.0.0
Keywords: Cc: steve.pellow@…, jamie.denman@…

Description

The Raptor Ezproxy logfile parser is parsing our Ezproxy logfile, which is in the format specified in the Raptor instructions. As configured in Raptor ICA, the parser only picks up Ezproxy logfile entries that contain the string 'connect?session'. This means that it only reads in the first site that you access in an Ezproxy session. If you go on to link to other Ezproxy-authenticated sites in the same session, Raptor ignores those logfile entries, even though a user might visit several sites during a typical Ezproxy single sign-on session.

A possible fix might be to add some logic that parses subsequent Ezproxy logfile lines that include the same Ezproxy session ID and a unique URL. For instance, if a user looks at several pages of one resource, there will be several Ezproxy logfile lines with that user's session ID and the same base URL of the resource. Raptor would only need the first URL for any particular Ezproxy resource accessed by that user.
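As a rough illustration of the de-duplication idea above (not Raptor's actual implementation), the sketch below keeps only the first logfile entry per session ID and hostname. The function name `first_access_per_resource` and the `(session_id, url)` tuple input are assumptions made for this example; real log lines would need to be parsed into that shape first.

```python
from urllib.parse import urlsplit


def first_access_per_resource(entries):
    """Yield the first (session_id, url) seen for each session/hostname pair.

    `entries` is assumed to be an iterable of (session_id, url) tuples in
    logfile order. Entries without a session ID ('-') or without a
    parseable hostname are skipped.
    """
    seen = set()
    for session_id, url in entries:
        host = urlsplit(url).hostname
        if session_id == "-" or host is None:
            continue
        key = (session_id, host)
        if key not in seen:
            seen.add(key)
            yield session_id, url


# Hypothetical entries, not taken from the attached logfile fragment.
sample = [
    ("s1", "http://www.example.com/page1"),
    ("s1", "http://www.example.com/page2"),  # same resource, same session
    ("s1", "http://other.example.org/"),
    ("s2", "http://www.example.com/page1"),  # same resource, new session
]
kept = list(first_access_per_resource(sample))
```

Keying on the full hostname treats `www` and `images` hosts of one site as different resources, which is the weakness raised in comment:1.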

Attachments

EzpLogfileFragment.txt (3.7 KB) - added by steve@… 7 years ago.
Ezproxy logfile fragment

Change History

comment:1 Changed 7 years ago by smith@…

Are you sure? I thought it put in a connect? on every new resource the user accesses. I remember looking at this when we were defining the original ezproxy parsing logic.

Re: your suggestion, a single resource may have multiple URLs (think www.somewhere.com, images.somewhere.com, static.somewhere.com)...

Changed 7 years ago by steve@…

Ezproxy logfile fragment

comment:2 Changed 7 years ago by steve@…

If you look at the bit of the Ezproxy logfile I've just uploaded (EzpLogfileFragment.txt), you can see that the user first tries to access 'bergfashion'. After a couple of lines showing the authentication process, this gives a 'connect?session' line for bergfashion. During the same session he goes on to look at Artstor and then Gale Infotrac under the same session ID. However, these entries don't include a 'connect?session'.

I take your point about the multiple URLs for the same resource. I'm not sure how you would deal with that other than factoring out the hostname bit (www / images / static) of the base URL and leaving the domain part (somewhere.com)?
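A minimal sketch of that "factor out the hostname bit" idea, assuming the domain part is simply the last two labels of the hostname. The helper name `base_domain` and its `labels` parameter are hypothetical. This heuristic is too crude for hosts under multi-label suffixes such as `.ac.uk`, where a public suffix list would be a more robust choice.

```python
def base_domain(hostname, labels=2):
    """Return the last `labels` dot-separated labels of a hostname.

    Naive heuristic: 'images.somewhere.com' -> 'somewhere.com'.
    For suffixes like '.ac.uk' the caller would have to pass labels=3.
    """
    parts = hostname.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])


# All three hostnames from comment:1 collapse to the same key.
hosts = ["www.somewhere.com", "images.somewhere.com", "static.somewhere.com"]
domains = {base_domain(h) for h in hosts}
```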

comment:3 Changed 7 years ago by smith@…

OK... think we know how to get this. We pick up the connect?session= as before, with user ID. We also look for login?url= but only where there is a user ID (i.e. not a hash). Then we will be picking up the first login, and subsequent resources accessed...

Looks from our logs like that catches everything, and I'm pretty sure it does in the Falmouth snippet attached as well.

comment:4 Changed 7 years ago by smith@…

  • Owner changed from Phil Smart to smartp@…
  • Status changed from new to assigned
  • Milestone set to v1.1.0

comment:5 Changed 7 years ago by steve@…

Hi,
By hash, do you mean the hyphens '-' that replace the EZP session ID and user ID/principal name if they don't exist yet in the transaction?

Just to clarify, you would either look positively for lines where there is a 'login?url' and a user_id in the 3rd position, or negatively eliminate lines with 'login?url' and hyphens in the 2nd and 3rd positions of the logfile.

I think that would work OK.

comment:6 Changed 7 years ago by smith@…

Oops, yes, when I said hash I meant hyphen.

Yes, you're right. Lines with user_id in third position and connect?session=blah&url=blah seem to catch the first service someone logs into, and lines with user_id in third position with login?url=blah seem to catch subsequent ones.
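The rule agreed above could be sketched roughly as follows. This is an illustration only, not the code that went into v1.1.0. It assumes a whitespace-separated layout of host, session ID, user ID, bracketed timestamp, then the quoted request; the regex, the function name `is_resource_access` and the sample lines are all made up for the example.

```python
import re

# Assumed layout: host session user [time] "METHOD url PROTOCOL" ...
LINE_RE = re.compile(r'^(\S+) (\S+) (\S+) \[([^\]]*)\] "(\S+) (\S+)[^"]*"')


def is_resource_access(line):
    """True if the line has a user ID in third position and the request
    URL contains 'connect?session=' or 'login?url='."""
    match = LINE_RE.match(line)
    if match is None:
        return False
    user_id = match.group(3)
    url = match.group(6)
    if user_id == "-":  # hyphen: no user yet, so still authenticating
        return False
    return "connect?session=" in url or "login?url=" in url


# Hypothetical lines, not copied from EzpLogfileFragment.txt.
ts = "[01/Jan/2012:10:00:00 +0000]"
proxy = "http://ezproxy.example.org:2048"
lines = [
    f'192.0.2.1 - - {ts} "GET {proxy}/login?url=http://a.example.com/ HTTP/1.1" 302 0',
    f'192.0.2.1 abc jsmith {ts} "GET {proxy}/connect?session=sabc&url=http://a.example.com/ HTTP/1.1" 302 0',
    f'192.0.2.1 abc jsmith {ts} "GET {proxy}/login?url=http://b.example.org/ HTTP/1.1" 302 0',
    f'192.0.2.1 abc jsmith {ts} "GET http://a.example.com/page HTTP/1.1" 200 1234',
]
flags = [is_resource_access(line) for line in lines]
```

Only the second and third sample lines are kept: the first has hyphens in place of the session and user IDs, and the fourth is an ordinary page request within a resource.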

comment:7 Changed 7 years ago by smith@…

  • Cc steve.pellow@…, jamie.denman@… added; steve.pellow@… removed
  • Status changed from assigned to closed
  • Resolution set to fixed

Fixed in v1.1.0
