
BLU Discuss list archive



Re: Open Source Linguistic Tools



 Hi Goutham, 

Very cool!  Yeah, I got asked to look into capturing key terms, quotes, and other pertinent info from media streams, so I started writing up a Python script; after a few more hacks it's a few hundred lines of Python.  Short story: I demoed it, and folks basically said, all right, go and try to add summarization, categorization, and a few other things.  I know I could do some LSI (latent semantic indexing) and tie that in, but I wanted to check what other folks have written before staring at a screen for a week.

Right now I'm translating my Python back to C because I'll be working with the WordNet libraries, but I'd be happy to give you what I've got in about two (maybe three :) weeks' time.

Let me know! 

- Jared 


----- Original Message ---- 
From: goutham patnaik <[hidden email]> 
To: Kristian Erik Hermansen <[hidden email]> 
Cc: Jared Carlson <[hidden email]>; [hidden email] 
Sent: Wednesday, June 25, 2008 2:47:34 PM 
Subject: Re: Open Source Linguistic Tools 

hey jared, 

I wrote a C library as part of my research to do some topic modeling. It was based on the latent Dirichlet allocation model. It worked well for our purposes, but it needs some polishing before it can be released :) I intend to work on it this summer and release it soon.

In the meantime, where can I get a hold of your code? 




Goutham 


On Wed, Jun 25, 2008 at 9:23 AM, Kristian Erik Hermansen <[hidden email]> wrote: 

On Wed, Jun 25, 2008 at 9:52 AM, Jared Carlson <[hidden email]> wrote: 
> I was just wondering if anyone knows of, and likes, any open source software for text summarization, classification, etc.  I wrote a quick Python script that does a decent job of identifying proper nouns for names and organizations, identifying sources, etc., but it's a little crude, and I was wondering if anyone else has played with something they liked.  Thanks for any suggestions...

My buddy Goutham, a grad student now, was working on a C library for 
some AI/machine learning project a few years back.  I encouraged him 
to open source the code, but he thought no one would use it.  Now your 
email is my opportunity to convince him that he should give it away 
:-) Goutham, can you set up your library on sourceforge and then post 
a link back?  Thanks dude... 
-- 
Kristian Erik Hermansen 
-- 
CISSP, CEPT, CREA, CEH, Linux+, A+, QGCS, ACSA, this is getting ridiculous... 
http://kristian-hermansen.com


      

 


BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org