Thread: How to search old chat logs and convert chat logs to a more readable format?

  1. #1
    Join Date
    May 2011
    Beans
    279

    How to search old chat logs and convert chat logs to a more readable format?

    Hi,

    I've been discussing story ideas with a friend of mine on Empathy and [since a partial upgrade last year disabled Empathy] on Pidgin. I can't search the old Empathy logs, for whatever reason, and can't reinstall Empathy because of dependency issues. I want to find a practical way to convert these logs to another format which I can search and can more easily read.

    I plan on switching to Mint eventually, but I am using Ubuntu 11.04 right now, if it matters. I know it's not supported, but I have disabilities and I find a version of Gnome 2 the most accessible desktop for me, and I have read that some of the fixes I've made no longer work in more recent versions.

  2. #2
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: How to search old chat logs and convert chat logs to a more readable format?

    i don't use empathy, so can you go to the empathy folder (probably ~/.empathy or ~/.config/Empathy or ~/.gconf/apps/empathy) and check what format the archive is stored in?

    as for disabilities, care to list the tweaks you made that don't work in newer releases?
    You can look into Mate, it's a forked gnome2 environment that is not abandoned like its predecessor. Afaik it can be easily installed via PPA.
    if your question is answered, mark the thread as [SOLVED]. Thx.
    To post code or command output, use [code] tags.
    Check your bash script here // BashFAQ // BashPitfalls

  3. #3
    Join Date
    May 2011
    Beans
    279

    Re: How to search old chat logs and convert chat logs to a more readable format?

    Empathy uses .log format, Pidgin uses .html. For whatever reason, the desktop search won't touch either.

    As far as the fixes go: I installed a patch to turn off touchpad tapping and gestures, and the patch works in Gnome 2 but didn't work in Xfce or KDE. I installed joystick software. I disabled the pop-up scrollbars, and this restored regular scrollbars. I widened the scrollbars. I don't recall exactly how I did all this. I understand that in the newer versions, disabling pop-up scrollbars no longer restores regular scrollbars.

  4. #4
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: How to search old chat logs and convert chat logs to a more readable format?

    but what does the .log format mean exactly? if i wanted to write a script i'd have to know how to parse it. If i had any .log file i'd look myself but i don't, so...
    Is it something like USER(TIME): message ?

    "I disabled the pop-up scrollbars, and this restored regular scrollbars. I widened the scrollbars. I don't recall exactly how I did all this. I understand that in the newer versions, disabling pop-up scrollbars no longer restores regular scrollbars."
    that's not true, you can still disable overlay scrollbars and get oldschool bars spanning the whole width/height of the window.
    http://askubuntu.com/questions/34214...lay-scrollbars
    i just tested it in vboxed 12.10 and i have 'normal' scrollbars right before my eyes.
    Btw, i'd imagine it's actually easier to work with huge buttons that show up right under the cursor and require less precision than with rather slim bars.

    You should give latest releases a spin in virtualbox or on a separate test partition so you can stay in touch and test/break newer things without risking your tried and true setup, though i understand not everybody has time/patience to do that.

  5. #5
    Join Date
    May 2011
    Beans
    279

    Re: How to search old chat logs and convert chat logs to a more readable format?

    Well, I never had any luck with anything that's supposed to pop up when needed; these wouldn't pop up when I was trying to use them, and would pop up when I was trying to click on something else in the same area. I also tried Unity, but it didn't work for me. Newer versions might work better, but I understand it's designed around alt-tab to switch windows, and similar key combinations to search, and these are painful for me.

    I don't know about vboxed, but I don't have the disk space on this partition for anything fancy.

    I don't know how to answer your question about .log files. They open in the text editor/code editor.

    <?xml version='1.0' encoding='utf-8'?>
    <?xml-stylesheet type="text/xsl" href="log-store-xml.xsl"?>
    <log>
    <message time='20110611T11:55:33' id='[id]' name='[name]' token='[string of characters]' isuser='false' type='normal'>[message]</message>
    </log>
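
For reference, a record shaped like the one pasted above can be read with Python's standard library. This is only a minimal sketch, assuming the logs really follow that shape; the sample values (`alice@example.org`, `Alice`, the message text) and the helper name `read_messages` are made up for illustration.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Hypothetical sample in the same shape as the log pasted above.
SAMPLE = """<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet type="text/xsl" href="log-store-xml.xsl"?>
<log>
<message time='20110611T11:55:33' id='alice@example.org' name='Alice' token='' isuser='false' type='normal'>Hello there</message>
</log>
"""

def read_messages(xml_text):
    """Yield (timestamp, name, text) for every <message> element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    for msg in root.iter("message"):
        # The time attribute looks like 20110611T11:55:33
        when = datetime.strptime(msg.get("time"), "%Y%m%dT%H:%M:%S")
        yield when, msg.get("name"), msg.text or ""

for when, name, text in read_messages(SAMPLE):
    print("%s %s: %s" % (when, name, text))
```

Turning the `time` attribute into a real `datetime` makes it possible to sort or filter messages by date later on.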

  6. #6
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: How to search old chat logs and convert chat logs to a more readable format?

    so it's just an xml file. It would be trivial to whip up a script that would convert these logs to plaintext or something. Would the "TIME NAME: MSG" format be ok?
    What is the exact location of these .log files and does their naming convention say something about the chronology?
    Last edited by Vaphell; March 13th, 2013 at 06:03 AM.

  7. #7
    Join Date
    Jul 2011
    Location
    South-Africa
    Beans
    678
    Distro
    Ubuntu 12.04 Precise Pangolin

    Re: How to search old chat logs and convert chat logs to a more readable format?

    Hay,

    Jumping the gun on the OP.

    If he used Pidgin for the chats, the logs should be located in:
    ~/.purple/logs/**protocol**/**account**/**conversation_with**
    eg:
    ~/.purple/logs/jabber/user@gmail.com/Bob123/*

    Also the files are usually named by date in format:
    yyyy-mm-dd.hhmmss+gmt_offset.*
    eg:
    2011-01-05.221539+0200SAST.html
    That means:
    The chat was started on 2011-01-05 at 22:15:39 local time, GMT+02:00 (SAST)
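
The naming convention above can be decoded mechanically. The following is a rough sketch, assuming every file name follows the pattern of the example; the regular expression and the function name `parse_log_name` are illustrative, not anything Pidgin provides.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern for names like 2011-01-05.221539+0200SAST.html
# (assumed from the single example above).
NAME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2})\.(\d{6})([+-])(\d{2})(\d{2})([A-Za-z]*)\.(\w+)$"
)

def parse_log_name(filename):
    """Return the timezone-aware start time encoded in a log file name."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError("unexpected log file name: %r" % filename)
    date, clock, sign, hh, mm, _zone, _ext = m.groups()
    offset = timedelta(hours=int(hh), minutes=int(mm))
    if sign == "-":
        offset = -offset
    naive = datetime.strptime(date + clock, "%Y-%m-%d%H%M%S")
    return naive.replace(tzinfo=timezone(offset))

print(parse_log_name("2011-01-05.221539+0200SAST.html").isoformat())
# prints 2011-01-05T22:15:39+02:00
```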


    I have parsed logs from most protocols in Pidgin to study things like the most-used words. html2text is my first choice.
    Pidgin can be configured to use xml/html or text formatting for logs, xml/html being the default (on my computer at least).

    An example tree will look like this:
    ~/.purple/logs/
    Code:
    .
    ├── facebook
    │   └── user@yahoo.com
    │       └── 1417444932
    │           └── 2010-09-11.184410+0200SAST.html
    ├── irc
    │   ├── user@irc.freenode.net
    │   │   ├── chanserv
    │   │   │   └── 2012-09-16.170457+0200SAST.html
    │   │   └── #freedroid.chat
    │   │       └── 2012-09-16.170457+0200SAST.html
    │   └── user@irc.ubuntu.com
    │       ├── #apt-get.chat
    │       │   └── 2011-03-29.190034+0200SAST.html
    │       ├── #bash.chat
    │       │   ├── 2011-03-19.190444+0200SAST.html
    │       │   └── 2011-06-11.123456+0200SAST.html
    │       ├── chanserv
    │       │   ├── 2011-03-11.112919+0200SAST.html
    │       │   ├── 2011-03-11.122720+0200SAST.html
    │       │   ├── 2011-03-19.184624+0200SAST.html
    │       │   ├── 2011-03-19.184759+0200SAST.html
    │       │   ├── 2011-03-19.190445+0200SAST.html
    │       │   ├── 2011-03-29.140157+0200SAST.html
    │       │   ├── 2011-03-29.155033+0200SAST.html
    │       │   ├── 2011-03-29.184053+0200SAST.html
    │       │   ├── 2011-03-29.200023+0200SAST.html
    │       │   ├── 2011-06-11.122704+0200SAST.html
    │       │   ├── 2011-06-11.123457+0200SAST.html
    │       │   ├── 2011-06-11.173500+0200SAST.html
    │       │   ├── 2011-07-02.205535+0200SAST.html
    │       │   ├── 2011-07-02.211222+0200SAST.html
    │       │   ├── 2011-07-03.154857+0200SAST.html
    │       │   ├── 2011-07-03.155228+0200SAST.html
    │       │   ├── 2011-07-18.175911+0200SAST.html
    │       │   └── 2011-12-07.103202+0200SAST.html
    │       ├── #cubase6.chat
    │       │   └── 2012-02-19.121338+0200SAST.html
    │       ├── #cubase.chat
    │       │   └── 2012-02-19.121350+0200SAST.html
    │       ├── ##cubasedaw.chat
    │       │   └── 2012-02-19.121653+0200SAST.html
    │       ├── #cubasedaw.chat
    │       │   └── 2012-02-19.121655+0200SAST.html
    │       ├── floodbot1
    │       │   └── 2011-03-29.193455+0200SAST.html
    │       ├── frigg
    │       │   ├── 2011-03-11.112904+0200SAST.html
    │       │   ├── 2011-03-11.192621+0200SAST.html
    │       │   ├── 2011-03-19.184612+0200SAST.html
    │       │   ├── 2011-03-19.184744+0200SAST.html
    │       │   ├── 2011-03-19.202154+0200SAST.html
    │       │   ├── 2011-03-29.140143+0200SAST.html
    │       │   ├── 2011-03-29.155024+0200SAST.html
    │       │   ├── 2011-03-29.184044+0200SAST.html
    │       │   └── 2011-03-29.195957+0200SAST.html
    │       ├── #httpd.chat
    │       │   └── 2011-07-18.175441+0200SAST.html
    │       ├── ioria
    │       │   └── 2011-03-29.200047+0200SAST.html
    │       ├── #kdenlive.chat
    │       │   └── 2011-07-03.201216+0200SAST.html
    │       ├── #nvidia.chat
    │       │   └── 2011-03-11.122428+0200SAST.html
    │       ├── ##programming.chat
    │       │   └── 2011-03-19.190218+0200SAST.html
    │       ├── #sdl.chat
    │       │   ├── 2011-07-02.211221+0200SAST.html
    │       │   └── 2011-07-03.155227+0200SAST.html
    │       ├── ##sed.chat
    │       │   ├── 2011-06-11.123015+0200SAST.html
    │       │   ├── 2011-06-11.130423+0200SAST.html
    │       │   └── 2011-06-18.184012+0200SAST.html
    │       ├── #ubuntu.chat
    │       │   ├── 2011-03-11.112917+0200SAST.html
    │       │   ├── 2011-03-11.122717+0200SAST.html
    │       │   ├── 2011-03-19.184622+0200SAST.html
    │       │   ├── 2011-03-19.184757+0200SAST.html
    │       │   ├── 2011-03-29.140153+0200SAST.html
    │       │   ├── 2011-03-29.155029+0200SAST.html
    │       │   ├── 2011-03-29.184051+0200SAST.html
    │       │   ├── 2011-03-29.200000+0200SAST.html
    │       │   ├── 2011-06-11.122703+0200SAST.html
    │       │   ├── 2011-07-02.205529+0200SAST.html
    │       │   ├── 2011-07-03.154854+0200SAST.html
    │       │   └── 2011-07-18.175910+0200SAST.html
    │       ├── ##unavailable.chat
    │       │   └── 2011-06-11.173453+0200SAST.html
    │       └── #warzone2100-games.chat
    │           └── 2011-12-07.103201+0200SAST.html
    ├── jabber
    │   └── user@gmail.com
    │       ├── user@mxit.co.za
    │       │   └── 2010-07-05.214519+0200SAST.html
    │       ├── user@gmail.com
    │       │   ├── 2012-11-13.212606+0200SAST.html
    │       │   ├── 2012-11-14.201124+0200SAST.html
    │       │   ├── 2012-11-16.201558+0200SAST.html
    │       │   ├── 2012-11-16.220343+0200SAST.html
    │       │   ├── 2012-11-17.202557+0200SAST.html
    │       │   ├── 2012-11-17.224315+0200SAST.html
    │       │   ├── 2012-11-18.192944+0200SAST.html
    │       │   ├── 2012-11-20.212047+0200SAST.html
    │       │   ├── 2012-11-26.175254+0200SAST.html
    │       │   ├── 2012-11-26.182303+0200SAST.html
    │       │   ├── 2012-12-24.162313+0200SAST.html
    │       │   └── 2012-12-24.181217+0200SAST.html
    │       └── user@groupchat.google.com.chat
    │           └── 2011-10-20.215117+0200SAST.html
    └── tree
    
    32 directories, 77 files
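
A tree like this can be indexed with a few lines of Python. This is a sketch under the assumption that the layout is always protocol/account/conversation/file, as in the listing above; the function name `index_logs` and the buddy name `friend@gmail.com` in the demo are made up.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

def index_logs(root):
    """Map (protocol, account, conversation) to a sorted list of log files."""
    root = Path(root)
    index = defaultdict(list)
    # Layout assumed from the tree above: protocol/account/conversation/file
    for path in root.glob("*/*/*/*"):
        if path.is_file():
            protocol, account, conversation = path.relative_to(root).parts[:3]
            index[(protocol, account, conversation)].append(path)
    for files in index.values():
        files.sort()  # names start with the date, so this is roughly chronological
    return dict(index)

# Small demo on a throwaway directory instead of the real ~/.purple/logs
with tempfile.TemporaryDirectory() as tmp:
    conv = Path(tmp) / "jabber" / "user@gmail.com" / "friend@gmail.com"
    conv.mkdir(parents=True)
    (conv / "2012-11-14.201124+0200SAST.html").write_text("")
    (conv / "2012-11-13.212606+0200SAST.html").write_text("")
    for key, files in index_logs(tmp).items():
        print(key, [f.name for f in files])
```

For real use, the call would be something like `index_logs(Path.home() / ".purple" / "logs")`.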
    Switched away from windows XP to Ubuntu 9.04. Never turned around to look back.

  8. #8
    Join Date
    May 2011
    Beans
    279

    Re: How to search old chat logs and convert chat logs to a more readable format?

    I have a backup folder full of the logs. It's just that they're a pain to read and impossible to search with the built-in desktop search. I don't have the processing power or spare disk space on this partition for special search software.

  9. #9
    Join Date
    Jul 2011
    Location
    South-Africa
    Beans
    678
    Distro
    Ubuntu 12.04 Precise Pangolin

    Re: How to search old chat logs and convert chat logs to a more readable format?

    Hay,

    Have you tried opening it in a web browser?

    Or simply try html2text on one file and see the result:
    Code:
    sudo apt-get install html2text
    cat /path/to/file | html2text
    If you like the result, you can simply convert all files in a directory to the alternative formatting:
    Code:
    for file in ./*.log; do html2text "$file" > "$file.txt"; done
    Note this might produce output with:
    "That&apos;s" instead of "That's"
    "&quot;anything and everything&quot;" instead of ""anything and everything""
    etc.

    But it is much more readable.
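
If the leftover `&apos;` and `&quot;` entities are a nuisance, they can be decoded afterwards. A minimal sketch using Python 3's standard `html` module; the wrapper name `clean_line` is made up for illustration.

```python
import html

def clean_line(line):
    """Turn HTML character references such as &apos; back into plain characters."""
    return html.unescape(line)

print(clean_line("That&apos;s"))
print(clean_line("&quot;anything and everything&quot;"))
```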

    Cheers and good luck.
    If this is not what you want, feel free to accept better advice from Vaphell as I am sure he has much more experience than me in scripting

  10. #10
    Join Date
    Jul 2007
    Location
    Poland
    Beans
    4,499
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: How to search old chat logs and convert chat logs to a more readable format?

    quick and dirty scripting job that should convert all .log files to plain text in DAY HOUR NAME: MESSAGE format (files are created next to the .log files). In case you want some other format, point out which fields from that xml you want.
    If you need something more specific, e.g. clumping all files of a given contact into 1 big file or moving converted files somewhere, i can upgrade it, but i would need more info about the directory structure and naming conventions to figure out a proper solution.

    empathy_logs.sh:
    Code:
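    #!/bin/bash
    
    logpath=.     # put proper path here
    
    # find every .log file under $logpath (NUL-separated, so odd filenames are safe)
    # and write the converted text next to it as FILE.log.txt
    while IFS= read -rd $'\0' f
    do
      echo "converting $f"
      python empathy_logs.py "$f" > "$f.txt"
    done < <( find "$logpath" -iname '*.log' -print0 )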
    empathy_logs.py:
    Code:
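    #!/usr/bin/env python
    
    import sys
    from xml.etree.ElementTree import parse
    
    # parse the Empathy log given as the first argument
    xmldoc = parse( sys.argv[1] )
    root = xmldoc.getroot()
    
    for child in root:
      # time looks like 20110611T11:55:33 -> 2011-06-11 11:55:33
      time = child.attrib["time"]
      time = time[0:4]+"-"+time[4:6]+"-"+time[6:8]+" "+time[9:17]
      name = child.attrib["name"]
      msg = child.text or ""   # an empty message has no text
      
      # print() with a single argument works in both python 2 and 3
      print( "%s %s: %s" % ( time, name, msg ) )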
    bash script is just a wrapper around the python script


    example with 2 dummy log files:
    Code:
    $ cat test*.log
    <?xml version='1.0' encoding='utf-8'?>
    <?xml-stylesheet type="text/xsl" href="log-store-xml.xsl"?>
    <log>
    <message time='20110611T11:55:33' id='[id]' name='[name1]' token='[string of characters]' isuser='false' type='normal'>some message</message>
    <message time='20110611T11:55:36' id='[id]' name='[name4]' token='[string of characters]' isuser='false' type='normal'>some other message &amp;&lt;&apos;</message>
    </log> 
    <?xml version='1.0' encoding='utf-8'?>
    <?xml-stylesheet type="text/xsl" href="log-store-xml.xsl"?>
    <log>
    <message time='20110611T11:55:33' id='[id]' name='[name1]' token='[string of characters]' isuser='false' type='normal'>abc</message>
    <message time='20110611T11:55:34' id='[id]' name='[name2]' token='[string of characters]' isuser='false' type='normal'>def</message>
    <message time='20110611T11:55:35' id='[id]' name='[name3]' token='[string of characters]' isuser='false' type='normal'>ghi</message>
    <message time='20110611T11:55:36' id='[id]' name='[name4]' token='[string of characters]' isuser='false' type='normal'>jkl</message>
    </log> 
    $ ./empathy_logs.sh 
    converting ./test1.log
    converting ./test2.log
    $ cat test*.log.txt
    2011-06-11 11:55:33 [name1]: some message
    2011-06-11 11:55:36 [name4]: some other message &<'
    2011-06-11 11:55:33 [name1]: abc
    2011-06-11 11:55:34 [name2]: def
    2011-06-11 11:55:35 [name3]: ghi
    2011-06-11 11:55:36 [name4]: jkl
    Last edited by Vaphell; March 14th, 2013 at 11:06 PM.
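
Once the logs are plain text, searching them no longer needs the desktop search. Below is a small sketch that looks for a phrase in every converted file under a folder; it assumes the `.log.txt` suffix produced by the wrapper script above, and the function name `search_logs` is made up for illustration.

```python
import os

def search_logs(root, needle, suffix=".log.txt"):
    """Yield (path, line_number, line) for each case-insensitive match."""
    needle = needle.lower()
    for dirpath, _dirs, files in os.walk(root):
        for fname in sorted(files):
            if not fname.endswith(suffix):
                continue
            path = os.path.join(dirpath, fname)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, 1):
                    if needle in line.lower():
                        yield path, number, line.rstrip("\n")

if __name__ == "__main__":
    import sys
    # usage: python3 search_logs.py /path/to/logs "search words"
    if len(sys.argv) == 3:
        for path, number, line in search_logs(sys.argv[1], sys.argv[2]):
            print("%s:%d: %s" % (path, number, line))
```

From a terminal, `grep -ri "search words" /path/to/logs` does much the same job with no extra software.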
