My Experiment with a Python-Based Open Source Data Visualization and Analysis Tool – Orange (Part 1)


Run the full Orange 2.0b installer for Windows. Along with Orange itself, it installs pyMl, PythonWin, Qt, and several other Python libraries.

http://orange.biolab.si/

Orange can read regular tab- or comma-delimited data, as well as C4.5 data, which is split across two files: *.data and *.names. The main supported formats are C4.5, Assistant, Retis, and tab-delimited (native Orange).

Loading Tab-Delimited Data:

Launch PythonWin and make sure you have a tab-delimited file to load (here, lenses.tab):

>>> import orange
>>> print orange.version
2.0b (19:12:26, Feb 14 2012)
>>> data = orange.ExampleTable(r"C:\Python27\lenses.tab")
>>> print data.domain.attributes
<Orange.feature.Discrete 'age', Orange.feature.Discrete 'prescription', Orange.feature.Discrete 'astigmatic', Orange.feature.Discrete 'tear_rate'>
>>>
>>> for i in data.domain.attributes:
...     print i.name,
...     print
...
age
prescription
astigmatic
tear_rate
>>> for i in range(3):
...     print data[i]
...
['young', 'myope', 'no', 'reduced', 'none']
['young', 'myope', 'no', 'normal', 'soft']
['young', 'myope', 'yes', 'reduced', 'none']
>>> for i in range(5):
...     print data[i]
...
['young', 'myope', 'no', 'reduced', 'none']
['young', 'myope', 'no', 'normal', 'soft']
['young', 'myope', 'yes', 'reduced', 'none']
['young', 'myope', 'yes', 'normal', 'hard']
['young', 'hypermetrope', 'no', 'reduced', 'none']
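
To make the file format more concrete, here is a minimal sketch of what a file like lenses.tab is assumed to look like on disk. It relies on Orange's native tab layout having three header rows (attribute names, attribute types, and flags such as "class"), followed by one row per example. The helper names `build_tab` and `parse_tab` are made up for illustration, and the sketch uses only plain string handling, so it runs without Orange installed.

```python
# Assumed native Orange layout: names row, types row, flags row, then data.
HEADER = [
    ["age", "prescription", "astigmatic", "tear_rate", "lenses"],
    ["discrete"] * 5,
    ["", "", "", "", "class"],  # only the last column is flagged as the class
]
ROWS = [
    ["young", "myope", "no", "reduced", "none"],
    ["young", "myope", "no", "normal", "soft"],
    ["young", "myope", "yes", "reduced", "none"],
]


def build_tab(header, rows):
    """Join header and data rows into tab-delimited text (hypothetical helper)."""
    return "\n".join("\t".join(fields) for fields in header + rows) + "\n"


def parse_tab(text):
    """Split the text back into names, the class column name, and data rows."""
    lines = [line.split("\t") for line in text.splitlines()]
    names, types, flags = lines[0], lines[1], lines[2]
    class_name = names[flags.index("class")]
    return names, class_name, lines[3:]


tab_text = build_tab(HEADER, ROWS)
names, class_name, rows = parse_tab(tab_text)
print(names)
print(class_name)
print(len(rows))
```

This also explains why each printed example above has five values while data.domain.attributes lists only four names: the fifth column is the class, which the domain keeps separate from the ordinary attributes.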

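Loading C4.5 Data:

Launch PythonWin and make sure you have a C4.5 file pair (*.data and *.names) to load. Here the table is loaded by its base name, "car", from a folder that contains car.data and car.names: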

>>> import os
>>> os.chdir("c:/Python27/ascdata")
>>> os.listdir(os.curdir)
['car.data', 'car.names', 'mydata.txt']
>>> car_data = orange.ExampleTable("car")
>>> print car_data.domain.attributes
<Orange.feature.Discrete 'buying', Orange.feature.Discrete 'maint', Orange.feature.Discrete 'doors', Orange.feature.Discrete 'persons', Orange.feature.Discrete 'lugboot', Orange.feature.Discrete 'safety'>

>>> for i in range(10):
...     print car_data[i]
...
['v-high', 'v-high', '2', '2', 'small', 'low', 'unacc']
['v-high', 'v-high', '2', '2', 'small', 'med', 'unacc']
['v-high', 'v-high', '2', '2', 'small', 'high', 'unacc']
['v-high', 'v-high', '2', '2', 'med', 'low', 'unacc']
['v-high', 'v-high', '2', '2', 'med', 'med', 'unacc']
['v-high', 'v-high', '2', '2', 'med', 'high', 'unacc']
['v-high', 'v-high', '2', '2', 'big', 'low', 'unacc']
['v-high', 'v-high', '2', '2', 'big', 'med', 'unacc']
['v-high', 'v-high', '2', '2', 'big', 'high', 'unacc']
['v-high', 'v-high', '2', '4', 'small', 'low', 'unacc']
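
To show how the two C4.5 files fit together, here is a rough sketch that parses a cut-down *.names and *.data pair with plain Python. It assumes the usual C4.5 conventions: text after "|" is a comment, the first entry in the .names file lists the class values, each later line is "name: value, value, ...", and the .data file holds comma-separated rows with the class last. The file contents below are illustrative stand-ins, not the real car.names: the value lists contain only the values visible in the rows printed above, and the function names are hypothetical.

```python
NAMES_TEXT = """\
| class values (only the one visible in the printed rows)
unacc.

| attributes: name, colon, comma-separated values, closing period
buying: v-high.
maint: v-high.
doors: 2.
persons: 2, 4.
lugboot: small, med, big.
safety: low, med, high.
"""

DATA_TEXT = """\
v-high,v-high,2,2,small,low,unacc
v-high,v-high,2,2,small,med,unacc
v-high,v-high,2,4,small,low,unacc
"""


def strip_comment(line):
    """Drop anything after '|' and surrounding whitespace."""
    return line.split("|", 1)[0].strip()


def parse_names(text):
    """Return (class_values, [(attribute_name, values), ...])."""
    lines = [strip_comment(line) for line in text.splitlines()]
    lines = [line for line in lines if line]
    class_values = [v.strip() for v in lines[0].rstrip(".").split(",")]
    attributes = []
    for line in lines[1:]:
        name, values = line.split(":", 1)
        value_list = [v.strip() for v in values.rstrip(".").split(",")]
        attributes.append((name.strip(), value_list))
    return class_values, attributes


def parse_data(text):
    """Return one list of fields per non-empty line."""
    return [[f.strip() for f in line.split(",")]
            for line in text.splitlines() if line.strip()]


class_values, attributes = parse_names(NAMES_TEXT)
rows = parse_data(DATA_TEXT)
print([name for name, _ in attributes])
print(len(rows))
```

Each data row has one more field than there are attributes, because the last field is the class value. That matches the seven-element examples printed above against the six attribute names in car_data.domain.attributes.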
