
Knowledge extraction techniques applied to Web data

by Malika CHARRAD

Ecole Nationale des Sciences de l'Informatique, Université de la Manouba, Tunis - Master's degree in Computer Science, option: Document and Software Engineering, 2005
  


Part Three

Appendices

som_read_data

% This function reads the data file.

function sData = som_read_data(filename, varargin)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% Vérification des arguments

error(nargchk(1, 3, nargin)) % check no. of input args is correct

dont_care = 'NaN'; % default don't care string

comment_start = '#'; % the char a SOM_PAK command line starts with

comp_name_line = '#n'; % string denoting a special command line,

% which contains names of each component

label_name_line = '#l'; % string denoting a special command line,

% which contains names of each label

block_size = 1000; % block size used in file read

kludge = num2str(realmax, 100); % used in sscanf

% open input file

fid = fopen(filename);

if fid < 0

error(['Cannot open ' filename]);

end

% process input arguments

if nargin == 2

if isstr(varargin{1})

dont_care = varargin{1};

else

dim = varargin{1};

end

elseif nargin == 3

dim = varargin{1};

dont_care = varargin{2};

end

% if the data dimension is not specified, find out what it is

if nargin == 1 | (nargin == 2 & isstr(varargin{1}))

fpos1 = ftell(fid); c1 = 0; % read first non-comment line

while c1 == 0,

line1 = strrep(fgetl(fid), dont_care, kludge);

[l1, c1] = sscanf(line1, '%f ');

end

fpos2 = ftell(fid); c2 = 0; % read second non-comment line

while c2 == 0,

line2 = strrep(fgetl(fid), dont_care, kludge);

[l2, c2] = sscanf(line2, '%f ');

end

if (c1 == 1 & c2 ~= 1) | (c1 == c2 & c1 == 1 & l1 == 1)

dim = l1;

fseek(fid, fpos2, -1);

elseif (c1 == c2)

dim = c1;

fseek(fid, fpos1, -1);

warning on

warning(['Automatically determined data dimension is ' num2str(dim) '. Is it correct?']);

else

error(['Invalid header line: ' line1]);

end

end

% check the dimension is valid

if dim < 1 | dim ~= round(dim)

error(['Illegal data dimension: ' num2str(dim)]);

end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %% Lecture des données

sData = som_data_struct(zeros(1, dim), 'name', filename);

lnum = 0; % data vector counter

data_temp = zeros(block_size, dim);

labs_temp = cell(block_size, 1);

comp_names = sData.comp_names;

label_names = sData.label_names;

form = [repmat('%g', [1 dim-1]) '%g%[^ \t]'];

limit = block_size;

while 1,

li = fgetl(fid); % read next line

if ~isstr(li), break, end; % is this the end of file?

% all missing vectors are replaced by value realmax because

% sscanf is not able to read NaNs

li = strrep(li, dont_care, kludge);

[data, c, err, n] = sscanf(li, form);

if c < dim % if there were less numbers than dim on the input file line

if c == 0

if strncmp(li, comp_name_line, 2) % component name line?

li = strrep(li(3:end), kludge, dont_care); i = 0; c = 1;

while c

[s, c, e, n] = sscanf(li, '%s%[^ \t]');

if ~isempty(s), i = i + 1; comp_names{i} = s; li = li(n:end); end

end

if i ~= dim

error(['Illegal number of component names: ' num2str(i) ...
' (dimension is ' num2str(dim) ')']);

end

elseif strncmp(li, label_name_line, 2) % label name line?

li = strrep(li(3:end), kludge, dont_care); i = 0; c = 1;

while c

[s, c, e, n] = sscanf(li, '%s%[^ \t]');

if ~isempty(s), i = i + 1; label_names{i} = s; li = li(n:end); end

end

elseif ~strncmp(li, comment_start, 1) % not a comment, is it error?

[s, c, e, n] = sscanf(li, '%s%[^ \t]');

if c

error(['Invalid vector on input file data line ' ...
num2str(lnum+1) ': [' deblank(li) ']']),

end

end

else

error(['Only ' num2str(c) ' vector components on input file data line ' ...
num2str(lnum+1) ' (dimension is ' num2str(dim) ')']);

end

else

lnum = lnum + 1; % this was a line containing data vector

data_temp(lnum, 1:dim) = data'; % add data to struct

if lnum == limit % reserve more memory if necessary

data_temp(lnum+1:lnum+block_size, 1:dim) = zeros(block_size, dim);

[dummy nl] = size(labs_temp);

labs_temp(lnum+1:lnum+block_size, 1:nl) = cell(block_size, nl);

limit = limit + block_size;

end

% read labels

if n < length(li)

li = strrep(li(n:end), kludge, dont_care); i = 0; n = 1; c = 1;

while c

[s, c, e, n_new] = sscanf(li(n:end), '%s%[^ \t]');

if c, i = i + 1; labs_temp{lnum, i} = s; n = n + n_new - 1; end

end

end

end

end

% close input file

if fclose(fid) < 0, error(['Cannot close file ' filename]);

else fprintf(2, '\rdata read ok \n'); end

% set values

data_temp(data_temp == realmax) = NaN;

sData.data = data_temp(1:lnum,:);

sData.labels = labs_temp(1:lnum,:);

sData.comp_names = comp_names;

sData.label_names = label_names;

return;
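
The listing handles missing values by swapping the don't-care string for the text form of realmax before calling sscanf, then mapping realmax back to NaN once the data has been read. The fragment below is a minimal, standalone sketch of that placeholder trick on a single line. The sample line and the variable names raw, patched, values and label are hypothetical and do not come from the original function, and reading a fixed count of three numbers is a simplification of the format string built in the listing.

```matlab
% Minimal sketch of the placeholder trick used in som_read_data.
% The sample line and the names raw/patched/values/label are hypothetical.
dont_care = 'NaN';                   % marker for a missing component
kludge    = num2str(realmax, 100);   % text form of realmax, used as a stand-in

raw     = '5.1 NaN 1.4 setosa';      % one data line: 3 components + 1 label
patched = strrep(raw, dont_care, kludge);

% Read exactly three numbers; n is the index of the next unread character.
[values, count, errmsg, n] = sscanf(patched, '%g', 3);

values(values == realmax) = NaN;     % restore the missing-value marker
label = strtrim(patched(n:end));     % the rest of the line holds the label

disp(values');
disp(label);
```

The stand-in only works because no legitimate data value is expected to equal realmax; a data set that really contained that value would have it silently turned into NaN.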
