Cleaning data using * not working...

by ltek on Thu Sep 22, 2005 1:01 pm

I have read over the help file and looked at example macros... I think I'm doing it correctly but it is not working for me.

I tried to "clean" data from a CSV (which IM is building), but even when using * before the variable {{!COL5}}, the result still has %20 in front of the text I need.

Here's an example of my use...

URL GOTO=http://info.site.com/info.asp?PubID=*{{!COL5}}

thanks for the help!

by Tech Support on Thu Sep 22, 2005 3:55 pm

The "*" function (wildcard) is used in comparisons. If used inside the TAG command, as in http://www.iopus.com/iim/help/faq_session_id.htm, it can make your macro command insensitive to " " (blanks) in the web page HTML. But it does not remove blanks from strings that are stored _inside_ a variable.

If you use VB or VBS, there is a "trim" function that does what you need.

Or, if you create or edit the input file manually, you can do a "Search & Replace" and replace all " " with "".
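As a rough illustration of the cleanup described above, here is a minimal Python sketch that trims leading and trailing blanks from every field of a CSV before the macro reads it. Python is only a stand-in here (the reply suggests VB/VBS "trim" or a manual search and replace), and the sample data and the helper name strip_fields are made up for the example. It also shows why a leading blank shows up as %20 once the value is placed in a URL.

```python
import csv
import io
from urllib.parse import quote


def strip_fields(rows):
    """Return rows with leading/trailing whitespace removed from every field."""
    return [[field.strip() for field in row] for row in rows]


# Hypothetical input data: one field has a leading blank, another a trailing one.
raw = "name, 12345\nother,678 \n"
rows = list(csv.reader(io.StringIO(raw)))
cleaned = strip_fields(rows)

# Write the cleaned rows back out as CSV text.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(cleaned)
cleaned_csv = out.getvalue()

# URL-encode the same field before and after trimming:
# the leading blank is what becomes %20 in the address.
before = quote(rows[0][1])
after = quote(cleaned[0][1])
```

Unlike a blanket replace of " " by "", trimming only touches the ends of each field, so values that legitimately contain blanks in the middle are left alone.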

by ltek on Thu Sep 22, 2005 4:14 pm

I got this from your demo macro 'Demo-Loop-Csv-2-Web' ...

'
'Note * is used to ignore leading and trailing blanks that could be in the input data
'
'The percent (%) symbol is used to select the stateid by VALUE as defined in the website select statement and not by its index.
TAG POS=1 TYPE=SELECT FORM=NAME:WebDataEntry ATTR=NAME:STATEID CONTENT=$*{{!COL6}}*
'
'The string ($) symbol is used to select the country by TEXT, not by its index.
'Index would be the position of an entry in the combo box list, e. g. 161 for United States
TAG POS=1 TYPE=SELECT FORM=NAME:WebDataEntry ATTR=NAME:COUNTRYID CONTENT=$*{{!COL7}}*


So is this wrong?

Is there not a function to remove leading/trailing spaces as this macro states?

by Tech Support on Thu Sep 22, 2005 4:28 pm

Sorry, the comment in the macro is not correct. It should read:

'Note * is used to ignore leading and trailing blanks that could be in the web page

Example:

If the website has the following html code:

<select name="color">
<option> red </option>
<option>green </option>
<option>yellow</option>
</select>

then you can select "red" by using
CONTENT=$<SP>red<SP>:
or (better)
CONTENT=$*red*

This is recommended as website html often contains this kind of "random" spaces. Web designers do not pay much attention to them but for automating websites it can be important :wink:
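To make the difference concrete, here is a small Python sketch of the idea, using the option texts from the html example above. It uses Python's fnmatch module as a stand-in for wildcard comparison; this is an analogy only and not a description of how the TAG command is implemented. The helper name select is made up for the example.

```python
from fnmatch import fnmatchcase

# Option texts exactly as they appear in the html example, blanks included.
options = [" red ", "green ", "yellow"]


def select(pattern, options):
    """Return the first option whose text matches the wildcard pattern, or None."""
    for text in options:
        if fnmatchcase(text, pattern):
            return text
    return None


# Without wildcards the pattern has to equal the option text exactly,
# so the stray blanks around "red" make the comparison fail.
exact = select("red", options)

# With wildcards on both sides the surrounding blanks no longer matter.
wild = select("*red*", options)
```

Note that in this sketch the wildcard only affects the comparison. It does not change the text on either side, which is why it cannot be used to strip blanks out of a variable. A pattern like *red* would also match any other text containing "red", so it is best kept as specific as the page allows.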

by ltek on Thu Sep 22, 2005 8:12 pm

you have a few macros that have that same statement... very confusing. I spent some serious time trying to figure out what I was doing wrong.

I'm finding that most of what I need to do must be done manually in script... it seems IM is mainly an "extension" to WSH for doing anything more than simple tasks.

IM seems nice for basic web page automation and form/table scraping or static data entry but other more advanced and/or dynamic things require scripting in vb/js/etc.

questions...

only 3 variables in the macros? This is VERY limiting, why? It forces me to create several macros and link them instead of one large one. I don't see the point.

why not allow the script right in the macro itself? like ASP, where you designate the type of language within the file itself

by Tech Support on Fri Sep 23, 2005 6:52 am

1. The separation between scripting and the macros is by design. You can use ANY Windows programming language to interface with the IIM browser. Thus you do not have to learn a new proprietary language, but can use your well-known languages such as VBS, VB6, VB.NET or C#.

2. Using the Scripting Interface and the iimSet command, you have up to 100 variables. So far, no user ever reached this limit :D

3. The option to include script code inside of macros is something that we plan for future releases.

