The Clarinet BBoard
Author: Chalumeau Joe
Date: 2008-02-10 05:04
A "text cloud" is a visual technique for depicting the frequency of words in a web site, document, speech (e.g., the State of the Union), and the like. Basically, the more frequently a word is used, the more "important" it is, and the more important the word, the larger the font in which it is displayed in the cloud.
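As a rough illustration of the frequency-to-font-size idea, here is a minimal Python sketch. It is not how tagcrowd.com works internally (that is unknown here); the linear scaling, the point-size range, and the function names `word_frequencies` and `font_sizes` are all assumptions made for the example.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word appears, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def font_sizes(freqs, min_pt=10, max_pt=40):
    """Map each word's count to a font size (assumed linear scaling)."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = hi - lo
    sizes = {}
    for word, count in freqs.items():
        if span == 0:
            # Every word is equally frequent, so give them all the same size.
            sizes[word] = max_pt
        else:
            sizes[word] = round(min_pt + (count - lo) * (max_pt - min_pt) / span)
    return sizes

sample = "clarinet reed clarinet mouthpiece clarinet reed"
sizes = font_sizes(word_frequencies(sample))
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size}pt")
```

The most frequent word gets the largest size and the least frequent the smallest; a real cloud generator might use logarithmic scaling instead, so that one very common word does not dwarf everything else.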
Being thoroughly bored this evening, I looked at the past 30 messages posted to this BB between 2008-02-08 03:04 and 2008-02-10 03:42. Using an application from http://tagcrowd.com, I obtained the cloud shown in the attached picture. Basically, it's a kind of indicator of what members of this board felt were important things to discuss during this time period.
Would my time have been better spent practicing my clarinet instead? (Yes.) Am I a pathetic geek loser for doing this? (Quite likely.) Is it fun? (Yeah, I think so.)
Enjoy (but don't take it too seriously).
Joe
Post Edited (2008-02-10 05:05)
Author: butterflymusic
Date: 2008-02-10 06:22
Good to see "clarinet" rated the largest font size :D
.....it'd be interesting to see this done over a period of, say, a month or even a year. As you can see, the most recent topics such as "ricardo" and "julian" had an impact where they might not if the selection were over a longer time. But that's just the analyst geek in me coming out. :D
This was kinda cool.
Author: Chalumeau Joe
Date: 2008-02-10 06:25
Yes, it would be interesting to go back to the beginning of the BB and perform the analysis, or at least have a weekly "moving average" of what's hot.
MARK C.: How about it?
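The weekly "moving average" idea could be sketched roughly as follows. This is only an illustration under assumptions: the `(date, text)` message format, the ISO-week bucketing, the four-week default window, and the function names are all made up for the example, and the BB's actual archive format is not known here.

```python
import re
from collections import Counter, defaultdict
from datetime import date

def weekly_counts(messages):
    """Group word counts by ISO (year, week). messages: iterable of (date, text)."""
    weeks = defaultdict(Counter)
    for day, text in messages:
        iso = day.isocalendar()
        weeks[(iso[0], iso[1])].update(re.findall(r"[a-z]+", text.lower()))
    return weeks

def moving_average(weeks, word, window=4):
    """Average a word's weekly count over the last `window` weeks that have posts.

    Weeks with no messages at all are simply absent (not counted as zero).
    """
    keys = sorted(weeks)
    series = [weeks[key][word] for key in keys]
    averages = []
    for i, key in enumerate(keys):
        chunk = series[max(0, i - window + 1): i + 1]
        averages.append((key, sum(chunk) / len(chunk)))
    return averages

messages = [
    (date(2008, 1, 28), "reed reed"),
    (date(2008, 2, 4), "reed"),
    (date(2008, 2, 10), "clarinet reed reed reed"),
]
for week, avg in moving_average(weekly_counts(messages), "reed", window=2):
    print(week, avg)
```

A word that is "hot" only briefly would show a spike that fades as the window moves on, while perennial topics would stay roughly level.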
Author: Tom A
Date: 2008-02-10 09:42
"Youtube" and "hilarious" rate about the same. I wonder what we all find so funny.
Surprised to see that the word "listening" doesn't appear.
Author: Don Berger
Date: 2008-02-10 15:56
Very interesting "concept?", C J. Are distinctions made re: "?trivial?" words, such as the articles and the conjunctions? If it's a "sheer" frequency of occurrence, doesn't the "cloud" downrate many longer, highly important words, such as our "polycylindrical, undercut, nickel-silver alloy, grenadilla-blackwood, chalumeaux" so dear to our hearts ??? Sun AM thots. Don
Thanx, Mark, Don
Author: Bob Phillips
Date: 2008-02-10 16:54
Mark,
Make your analysis over shorter periods, so that we can see the evolution of our mutual neuroses.
Bob Phillips
Author: Chalumeau Joe
Date: 2008-02-10 17:01
Don,
The site I used has a number of customization options, such as the ability to remove certain words, group similar words, and ignore common English words (e.g., "and", "of", "the") that could bias the cloud. In my analysis, I stripped out words such as "Author" and "Date", along with most ISP-related information, since these are common to every message.
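The kind of filtering described above might look something like this in Python. It is a sketch only: the stop-word list, the boilerplate list, the grouping table, and the name `filtered_counts` are illustrative assumptions, not the site's actual options or word lists.

```python
import re
from collections import Counter

# Hypothetical lists for illustration; a real tool would use far longer ones.
STOPWORDS = {"and", "of", "the", "a", "an", "to", "in", "is", "it"}
BOILERPLATE = {"author", "date"}  # labels that appear in every message
GROUPS = {"clarinets": "clarinet", "reeds": "reed"}  # fold variants together

def filtered_counts(text, stopwords=STOPWORDS, boilerplate=BOILERPLATE, groups=GROUPS):
    """Count words after grouping variants and dropping unwanted words."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        word = groups.get(word, word)
        if word in stopwords or word in boilerplate:
            continue
        counts[word] += 1
    return counts

text = "Author: Joe Date: today The clarinets and the clarinet of the reeds"
print(filtered_counts(text).most_common())
```

Without the filtering, "the" would be the most frequent word in this sample; with it, "clarinet" comes out on top.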
As you noted, there are some terms that may be more dear to some hearts (e.g., "grenadilla", "undercut"); they just weren't as dear at the time of the analysis (remember, I looked at a very brief snapshot of messages over the past two days, roughly 30 threads). Those other words may indeed pop up one day with a more thorough (and continuously updated) analysis of our BB threads. That's why having some kind of historical and weekly moving-average analysis might be useful.
Joe
Author: Don Berger
Date: 2008-02-10 18:44
TKS, C J - I just wanted to know a bit more about this method of "text indexing", which is used widely in patent/literature searching, where I had a number of years of corporate experience. It would likely be useful in searching our BB archives, both in formulating [Boolean?] search questions and in suggesting the use of specific terms rather than generics. Comments, you other ?old? searchers ?? Don
Thanx, Mark, Don
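For readers unfamiliar with the "text indexing" and Boolean searching mentioned above, here is a small Python sketch of the general idea: an inverted index that maps each word to the posts containing it, queried with AND/OR. The sample posts and the function names are invented for illustration; this says nothing about how the BB's own search is implemented.

```python
import re
from collections import defaultdict

def build_index(posts):
    """Map each word to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in set(re.findall(r"[a-z]+", text.lower())):
            index[word].add(post_id)
    return index

def search_and(index, *terms):
    """Posts containing every term."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()

def search_or(index, *terms):
    """Posts containing at least one term."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))

# Made-up posts, purely for demonstration.
posts = {
    1: "Grenadilla barrel with undercut tone holes",
    2: "Wood barrel cracked",
    3: "Undercut tone holes on a plastic clarinet",
}
index = build_index(posts)
print(search_and(index, "undercut", "barrel"))
print(search_or(index, "grenadilla", "plastic"))
```

A specific term such as "grenadilla" narrows the results more than a generic one such as "wood", which is the point about preferring specific terms over generics.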