Analysing text files to obtain statistics on their content

#1
23-Jun-2008, 11:39 AM
Davo1977

Perl assignment

I need to write a program that analyses a text file to obtain statistics on its contents. It should check whether an argument has been provided and, if not, prompt for and accept a filename from the keyboard. The filename should be checked to ensure it is in MS-DOS format, with a name no longer than 8 characters. The file extension is optional, but if it is given it must be .txt, in either upper or lower case. If an extension isn't given, .TXT should be added to the end of the filename. If the filename isn't in the right format, the program should display a suitable error message and end at that point.

The program should then check whether the file exists. If it doesn't, it should display an error message and end. If the file exists but is empty, it should again display an error message and end. If the file exists and has content, it should be read and crude statistics displayed on the number of characters, words, lines, sentences and paragraphs in the file.

I am really struggling with this assignment, as the course study notes are simply not good enough to help me.

Thompsonelvis@aol.com

 
#2
23-Jun-2008, 06:05 PM
Maruchino
Not one to bash your post.. but what are you asking? You simply state you cannot do the project and supply your email address - do you expect someone to complete it and email it to you?

Think man think..

 
#3
23-Jun-2008, 06:48 PM
hbroomhall
I replied to this earlier - but somehow the reply has been lost in a black hole! (Glitches on CF?)

First - welcome to CF!

Second - not a good idea to put your email address in a posting, unless you *want* that mailbox to be full of spam!

On to the main question. I think you need to let us see how far you have got with it. I'm not going to give you 'the answer' but I will point out where you may be going wrong.

The golden rule for programming is to break the problem down into pieces. Code up those pieces, and then you will know how to deal with the problem as a whole. If you are going to be doing Web programming you *have* to be able to do this!

Harry.

 
#4
24-Jun-2008, 04:05 PM
Davo1977
I am very new to Perl and have managed to put together this code using examples from various books. Could anyone look over it and suggest how it could be improved?

#!/usr/bin/perl

use strict;
use warnings;

if ($#ARGV == -1) #no filename provided as a command line argument.
{
print("Please enter a filename: ");
$filename = <STDIN>;
chomp($filename);
}
else #got a filename as an argument.
{
$filename = $ARGV[0];
}

#perform the specified checks
#check if filename is valid, exit if not
if ($filename !~ m^/[a-z]{1,7}\.TXT$/i)
{
die("File format not valid\n");)
}

if ($filename !~ m/\.TXT$/i)
{
$filename .= ".TXT";
}

#check if filename is actual file, exit if it is.
if (-e $filename)
{
die("File does not exist\n");
}

#check if filename is empty, exit if it is.
if (-s $filename)
{
die("File is empty\n");
}

my $i = 0;
my $p = 1;
my $words = 0;
my $chars = 0;

open(READFILE, "<$data1.txt") or die "Can't open file '$filename: $!";

#then use a while loop and series of if statements similar to the following
while (<READFILE>) {
chomp; #removes the input record Separator
$i = $.; #"$". is the input record line numbers, $i++ will also work
$p++ if (m/^$/); #count paragraphs
$my @t = split (/\s+/); #split sentences into "words"
$words += @t; #add count to $words
$chars += tr/ //c; #tr/ //c count all characters except spaces and add to $chars
}


#display results
print "There are $i lines in $data1\n";
print "There are $p Paragraphs in $data1\n";
print "There are $words in $data1\n";
print "There are $chars in $data1\n";

close(READFILE);


Last edited by Davo1977 : 25-Jun-2008 at 01:53 PM.
 
#5
24-Jun-2008, 06:53 PM
hbroomhall
Quote:
Originally Posted by Davo1977
This is what I have done so far on the following subject could anybody elaborate on it please.

Well - it seems a fairly good start. However, the way it reads suggests that it has been copied from (at least) two different places. Do you actually understand the code? Because if you do, some of the changes should be obvious.

I'll comment on some of the things I spotted. That isn't to say that there may not be other points:
Quote:
Originally Posted by Davo1977
#check if filename is valid, exit if not
if ($filename !~ m/[a-zA-Z]{0,7}\.TXT$/i)

If you are using /i then the a-zA-Z is redundant - just use a-z. (Strictly you should use the POSIX class, written [[:alpha:]], instead of [a-z], but I wouldn't ding a beginner for that!)
You are insisting on the .TXT being there - not allowing it to be optional.
You aren't anchoring the match to the beginning of the filename (use ^).
You allow a zero length part before the . - I would say that was a grey area - I'd insist on at least one character myself.
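
Taken together, the four points above suggest a pattern along the following lines. This is only a sketch: the helper name normalise_filename is made up for illustration, and it assumes the base name contains letters only (as in the posted code) and may be up to 8 characters long, as the assignment states. It uses \A and \z rather than ^ and $ so that a trailing newline cannot slip through.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the filename
# with .TXT appended when no extension was given, or undef if the name
# does not fit the assumed format.
sub normalise_filename {
    my ($name) = @_;

    # \A and \z anchor both ends of the string.
    # [a-z]{1,8} with /i: one to eight letters, either case
    # (assumption: letters only, as in the posted code).
    # (\.txt)? makes the extension optional.
    return undef unless $name =~ /\A[a-z]{1,8}(\.txt)?\z/i;

    $name .= '.TXT' unless defined $1;   # no extension was captured
    return $name;
}

print normalise_filename('data'), "\n";        # data.TXT
print normalise_filename('Report.txt'), "\n";  # Report.txt
```

Returning undef for a bad name leaves it to the caller to decide on the error message and whether to die.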
Quote:
Originally Posted by Davo1977
my $file = "data1.txt";

You already have your filename in $filename above - why introduce another variable?
Quote:
Originally Posted by Davo1977
open(READFILE, "<$data1.txt") or die "Can't open file '$data1.txt': $!";

This is muddled. data1.txt was a literal filename - not a variable. If you wanted to use a variable it would be $file (or $filename from higher up).
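
A minimal sketch of the open using the $filename variable from higher up. The helper name open_checked is hypothetical. It also puts the existence and size checks in front of the open; note that in the posted script the -e and -s tests appear to be the wrong way round, since -e is true when the file exists and -s is true when it has a non-zero size.

```perl
use strict;
use warnings;

# Hypothetical helper: run the existence and emptiness checks, then open
# the file for reading and return a lexical filehandle.
sub open_checked {
    my ($filename) = @_;

    die "File '$filename' does not exist\n" unless -e $filename;
    die "File '$filename' is empty\n"       unless -s $filename;

    # Three-argument open keeps the mode separate from the filename.
    open(my $fh, '<', $filename)
        or die "Can't open file '$filename': $!\n";
    return $fh;
}
```

A caller would then read with while (my $line = <$fh>) and close($fh) when finished.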
Quote:
Originally Posted by Davo1977
#then use a while loop and series of if statements similar to the following
while (<READFILE>) {
chomp; #removes the input record Separator
$i = $.; #"$". is the input record line numbers, $i++ will also work
$p++ if (m/^$/); #count paragraphs
$my @t = split (/\s+/); #split sentences into "words" + store them in @t

You can't use my like this! my is a keyword, not a variable, so the $ in front of it has to go. I'd also declare the array just once - take the 'my @t' out of the loop.
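
For reference, a sketch of how the declaration might look once the stray $ is removed. Declaring @t once above the loop, as suggested, works; Perl also accepts my @t inside the loop body, where it simply creates a fresh array on each pass. The sub name count_words and its filehandle argument are made up for the example, and split ' ' is swapped in for split(/\s+/) so that leading whitespace does not produce an empty first "word".

```perl
use strict;
use warnings;

# Hypothetical helper: count whitespace-separated words on a filehandle.
sub count_words {
    my ($fh) = @_;
    my $words = 0;
    my @t;                      # declared once, outside the loop

    while (my $line = <$fh>) {
        chomp $line;
        @t = split ' ', $line;  # split ' ' skips leading whitespace
        $words += @t;           # an array in scalar context gives its length
    }
    return $words;
}

# Read from an in-memory string so the example runs without a file.
my $text = "one two three\n  four five\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
print count_words($fh), "\n";   # 5
close($fh);
```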
Quote:
Originally Posted by Davo1977
$words += @t; #count all characters except spaces and add to $chars

The comment doesn't match the code. The code is counting words.
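
One way the counting lines and their comments might be lined up, as a sketch only. The sub name text_stats is hypothetical. Counting every non-space character after chomp (so tabs count and newlines do not) is an assumption carried over from the posted tr/ //c. Paragraphs are counted the way the posted code does it, starting at 1 and adding one per empty line, which is crude and over-counts when blank lines are doubled up or trail the text. Sentences are not attempted here.

```perl
use strict;
use warnings;

# Hypothetical helper: crude line, paragraph, word and character counts.
sub text_stats {
    my ($fh) = @_;
    my %n = (lines => 0, paragraphs => 1, words => 0, chars => 0);

    while (my $line = <$fh>) {
        chomp $line;                         # remove the trailing newline
        $n{lines}++;                         # count lines
        $n{paragraphs}++ if $line eq '';     # an empty line starts a new paragraph
        my @t = split ' ', $line;            # split the line into words
        $n{words} += @t;                     # add this line's word count to the total
        $n{chars} += ($line =~ tr/ //c);     # count every character except spaces
    }
    return %n;
}

my $text = "Hello there world.\n\nSecond paragraph here.\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
my %s = text_stats($fh);
close($fh);
# prints: 3 lines, 2 paragraphs, 6 words, 36 characters
print "$s{lines} lines, $s{paragraphs} paragraphs, $s{words} words, $s{chars} characters\n";
```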


Harry.

 