Domino Query Language & domlog.nsf

The introduction of the Domino Query Language (DQL) in Domino 10.0.1 can bring big performance benefits and allows interesting operations that were not possible before.

Here I want to describe a small but interesting example of what can be done. One problem I often face on long-running Domino installations is the size of the web access log, domlog.nsf.
As you probably know, once this option is enabled the Domino server records every HTTP request to any resource on the server. Every really means every: even the CSS files and the images that a page loads.
This inevitably makes the file grow very quickly, and it is often forgotten until someone discovers that it has reached several gigabytes and has become unusable.
However, I found a way to clean it regularly with the help of DQL, and with very short execution times.
The idea is simply to query domlog.nsf for all the entries I am not interested in (such as requests for .jpg or .png files). Since the query returns a NotesDocumentCollection, I can then remove the whole collection and so delete all of those documents at once.
Let's see it in practice (the code is further down).
Apart from the standard declarations, the first step is to declare (Dim) the NotesDominoQuery variable.

After that I create an array containing all the document types I want to clean up, that is, the ContentType values I want to delete from domlog.nsf. Of course, everyone can fill this array with whatever values they need.

Then the routine loops over the array (all of it, or only part of it as in this example) and, inside the loop, builds a query in this format:

dqlTerm = "ContentType = 'image/gif'"

Each query returns a NotesDocumentCollection, on which I call the RemoveAll method, and the job is done.
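
As a possible variation on the per-type query, DQL also has an in operator that matches a field against a list of values, so several content types could be covered by one query instead of one query per type. The sketch below is only an illustration: BuildInTerm is a hypothetical helper name that is not part of the agent, and it assumes the values contain no single quotes.

```lotusscript
' Hypothetical helper (not in the original agent): builds a single DQL term
' such as  ContentType in ('image/gif', 'image/jpeg')
' from the array entries between fromIdx and toIdx (inclusive).
' Assumption: the values contain no single quotes, so no escaping is done.
Function BuildInTerm(contentTypes() As String, fromIdx As Integer, toIdx As Integer) As String
	Dim i As Integer
	Dim quoted As String

	For i = fromIdx To toIdx
		' Separate the quoted values with a comma after the first one
		If quoted <> "" Then quoted = quoted & ", "
		quoted = quoted & "'" & contentTypes(i) & "'"
	Next

	BuildInTerm = "ContentType in (" & quoted & ")"
End Function
```

With the clean array used in the agent, BuildInTerm(clean, 7, 8) would produce ContentType in ('image/gif', 'image/jpeg'), which can be passed to dql.Execute in the same way as the single-value term. Whether one large query or several small ones is faster on a given domlog.nsf is something to measure with Explain rather than assume.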

Dim session As New NotesSession
Dim db As NotesDatabase
Dim dql As NotesDominoQuery
Dim col As NotesDocumentCollection
Dim dqlTerm As String
Dim i As Integer
' ContentType values to purge (index 0 is left unused)
Dim clean(14) As String
clean(1) = "application/javascript"
clean(2) = "application/json"
clean(3) = "application/octet-stream"
clean(4) = "application/pdf"
clean(5) = "application/x-javascript"
clean(6) = "application/zip"
clean(7) = "image/gif"
clean(8) = "image/jpeg"
clean(9) = "image/jpg"
clean(10) = "image/png"
clean(11) = "image/x-icon"
clean(12) = "Text/css"
clean(13) = "Text/javascript"
clean(14) = "Text/json"

Set db = session.GetDatabase("", "domlog.nsf", False)

' This example processes only entries 6 to 12 of the array
For i = 6 To 12
  dqlTerm = "ContentType = '" + clean(i) + "'"
  Set dql = db.CreateDominoQuery()
  Set col = dql.Execute(dqlTerm)
  ' Log the query plan and timings to the console
  Print dql.Explain(dqlTerm)
  ' Delete every matching log document (force = True)
  Call col.RemoveAll(True)
Next

The Print dql.Explain(dqlTerm) statement logs the details of the query to the console: documents scanned, documents found and execution times, as in the example below.
Agent Manager: Agent printing:

Query Processed:
[ContentType = 'application/javascript']

0.ContentType = 'application/javascript' NSF document search estimated cost = 100
Prep 0.341 msecs, Exec 6482.914 msecs, ScannedDocs 155812, Entries 0, FoundDocs 1273
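
The output above shows that the query scanned more than 150,000 documents. DQL enforces limits on the number of documents scanned and on run time, so on a very large domlog.nsf a query may stop with an error before returning anything. A minimal sketch of how the limits could be raised, and how the number of matches could be checked before deleting, follows. PreviewCleanup is a hypothetical name and the two limit values are arbitrary assumptions to be tuned for your server.

```lotusscript
' Hypothetical dry-run helper (not in the original agent): runs the query
' with raised limits and only reports how many documents it would delete.
Sub PreviewCleanup(db As NotesDatabase, dqlTerm As String)
	Dim dql As NotesDominoQuery
	Dim col As NotesDocumentCollection

	Set dql = db.CreateDominoQuery()

	' Assumed values, chosen only for illustration
	dql.MaxScanDocs = 2000000   ' maximum number of documents the query may scan
	dql.TimeoutSec = 600        ' maximum run time in seconds

	Set col = dql.Execute(dqlTerm)
	Print dqlTerm & " matches " & CStr(col.Count) & " documents"
End Sub
```

Calling PreviewCleanup(db, dqlTerm) inside the loop instead of RemoveAll gives a quick way to see what a cleanup run would remove before running it for real.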



3 Comments

  1. mynslog

    there are options for exclusions in all server documents 🙂 but this method is very cool

  2. Dragon

    It depends on your situation. For my systems, I need to check that the image/graphic was actually loaded from my site, and not by some other site using my images and resources.
    I wrote a script many years ago, that checks for images (and ensures that the referrer value was from the same site) then adds the entries to some stats (for load size info report etc.) and deletes the entry. It does the same for normal pages where somebody has followed a previous page.
    Bots (check the Agent name) get thrown into a separate stats entry. Anything with a HEAD entry that refers to another website gets listed (and IP blocked) as a scammer/spammer system.
    Graphics that are cross loaded get reported to see whether it was an image search or somebody has been stealing my content. Those who steal have a tendency to get 302’d to “disturbing” images in the future.
    Requests to malware names (magecart.js etc.) get flagged and reported into the security log.

    There is so much more to log files than just what pages and graphics get served. Properly used, they can give you a good idea of how often your server is attacked, and used. I’m surprised that people don’t keep an eye on their log files more often.

    • Fabio Di Paola

      I agree with you: it depends on needs and situations. This article is meant as an example of what we can do with DQL on our servers; it is not a universal recipe and should be tailored to each environment.
      I have been in situations (usually at customers) where the domlog was larger than 10 GB and had become unusable. Rather than deleting the file, this can be a quick way to clean it.