EvilZone

Programming and Scripting => Projects and Discussion => Topic started by: Huntondoom on November 01, 2011, 09:45:39 PM

Title: Post Grabber
Post by: Huntondoom on November 01, 2011, 09:45:39 PM
For a project I'm working on right now I need to make a post grabber: it needs to get the contents of a post (with username etc.) as HTML.

so far I have this:
Code: (C#)
string[] Posts = new string[0];
string Start = "<div class=\"windowbg\">".ToLower();

for (int S = 0; S < page.Length - Start.Length; ++S)
{
    string part = page.Substring(S, Start.Length).ToLower();
    if (part == Start)
    {
        int A = 0;
        for (int E = S; E < page.Length - 2; ++E)
        {
            part = page.Substring(E, 2);
            if (part.StartsWith("<") & !part.EndsWith("/")) { ++A; }
            if (part == "</") { --A; }
            if (part == "</" & A == 0)
            {
                Posts = AddToArray(page.Substring(S, E - S), Posts);
                S = E;
                break;
            }
        }
    }
}
but it doesn't always give me the whole post; sometimes it returns too little.
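One likely cause, offered as a guess rather than a diagnosis: the loop counts every "<" as an opening tag, so tags that never get a closing "</" (for example `<br />`, `<img>`, `<hr>` or `<!-- -->` comments) push the counter A up and it no longer returns to 0 at the post's own closing tag. A minimal sketch that tracks only `<div` and `</div>` is shown here. The names `PostGrabber` and `GrabPosts` are made up for illustration, and it assumes the markup contains no other tag starting with `<div` and no div tags inside comments or scripts. It uses `List<string>` instead of the `AddToArray` helper.

```csharp
using System;
using System.Collections.Generic;

static class PostGrabber
{
    // Returns each <div class="windowbg"> block up to and including its matching </div>.
    public static List<string> GrabPosts(string page)
    {
        const string start = "<div class=\"windowbg\">";
        const string open = "<div";
        const string close = "</div>";
        List<string> posts = new List<string>();

        int s = page.IndexOf(start, StringComparison.OrdinalIgnoreCase);
        while (s >= 0)
        {
            int depth = 1;               // the start marker itself opens one div
            int pos = s + start.Length;  // continue scanning after the marker
            int end = -1;
            while (depth > 0)
            {
                int nextOpen = page.IndexOf(open, pos, StringComparison.OrdinalIgnoreCase);
                int nextClose = page.IndexOf(close, pos, StringComparison.OrdinalIgnoreCase);
                if (nextClose < 0) break;  // unbalanced markup: stop looking
                if (nextOpen >= 0 && nextOpen < nextClose)
                {
                    ++depth;               // a nested div opens first
                    pos = nextOpen + open.Length;
                }
                else
                {
                    --depth;               // a div closes first
                    pos = nextClose + close.Length;
                    if (depth == 0) end = pos;
                }
            }
            if (end < 0) break;
            posts.Add(page.Substring(s, end - s));
            s = page.IndexOf(start, end, StringComparison.OrdinalIgnoreCase);
        }
        return posts;
    }
}
```

Unlike the loop above, this keeps the closing `</div>` in the returned text and ignores self-closing tags entirely, because only div tags affect the depth.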
Title: Re: Post Grabber
Post by: Kulverstukas on November 01, 2011, 10:25:52 PM
Consider using Regex. Much more elegant and probably faster.
And is that Java?
Title: Re: Post Grabber
Post by: Huntondoom on November 01, 2011, 11:25:39 PM
Quote from: Kulverstukas on November 01, 2011, 10:25:52 PM
Consider using Regex. Much more elegant and probably faster.
And is that Java?

Visual C#, which would be the Microsoft .NET version of Java.
Title: Re: Post Grabber
Post by: Huntondoom on November 02, 2011, 03:01:33 PM
But I don't want to depend on Regex, because it's a surprise and I want it to work on everyone's computer (everyone with Windows) without them having to upgrade their .NET version.
Title: Re: Post Grabber
Post by: ande on November 02, 2011, 03:13:32 PM
Quote from: Huntondoom on November 02, 2011, 03:01:33 PM
But I don't want to depend on Regex, because it's a surprise and I want it to work on everyone's computer (everyone with Windows) without them having to upgrade their .NET version.

Isn't regex a part of .net framework 2.0?
Title: Re: Post Grabber
Post by: Huntondoom on November 02, 2011, 03:40:08 PM
Quote from: ande on November 02, 2011, 03:13:32 PM
Isn't regex a part of .net framework 2.0?

Don't know, I never use Regex.
Title: Re: Post Grabber
Post by: Kulverstukas on November 02, 2011, 03:45:01 PM
Quote from: Huntondoom on November 02, 2011, 03:40:08 PM
Don't know, I never use Regex.

Well, learn it. I did and now I couldn't live without it!
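For anyone wondering what the Regex route could look like in C#, here is a minimal sketch. It assumes each post starts at `<div class="windowbg">` and that the next `<hr` tag marks where it ends; that end marker is an assumption about the page layout, and `RegexGrabber` is a made-up name. As far as I know, `System.Text.RegularExpressions` lives in System.dll and has shipped with the .NET Framework since its earliest releases, so it should not force an upgrade.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexGrabber
{
    // Returns the text from each start marker up to (not including) the next "<hr".
    public static List<string> GrabPosts(string page)
    {
        // .*? is lazy, so each match stops at the first "<hr" that follows it.
        // Singleline lets '.' match newlines; IgnoreCase mirrors the ToLower() calls.
        Regex postPattern = new Regex(
            "<div class=\"windowbg\">.*?(?=<hr)",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        List<string> posts = new List<string>();
        foreach (Match m in postPattern.Matches(page))
        {
            posts.Add(m.Value);
        }
        return posts;
    }
}
```

A pattern like this cannot balance nested tags, so it only suits pages where a fixed end marker reliably follows every post.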
Title: Re: Post Grabber
Post by: ande on November 02, 2011, 03:47:33 PM
Quote from: Huntondoom on November 02, 2011, 03:40:08 PM
Don't know, I never use Regex.

Okay, found out. In fact, it's part of the 4.0 framework... which sucks.
Title: Re: Post Grabber
Post by: Huntondoom on November 02, 2011, 07:29:02 PM
I made this so far; it now returns almost every post:
Code: (C#)
string[] Posts = new string[0];
string Start = "<div class=\"windowbg\">".ToLower();
string End = "<hr".ToLower();

for (int S = 0; S < page.Length - Start.Length; ++S)
{
    string part = page.Substring(S, Start.Length).ToLower();
    if (part == Start)
    {
        for (int E = S; E < page.Length - End.Length; ++E)
        {
            part = page.Substring(E, End.Length).ToLower();
            if (part == End)
            {
                Posts = AddToArray(page.Substring(S, E - S), Posts);
                S = E;
                break;
            }
        }
    }
}
return Posts;
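The same start-marker-to-end-marker idea can be written with `String.IndexOf`, which avoids building a substring at every position and growing the array one element at a time. This is only a sketch: `MarkerGrabber` and `GrabBetween` are made-up names, and it returns a `string[]` built from a `List<string>` rather than calling `AddToArray`.

```csharp
using System;
using System.Collections.Generic;

static class MarkerGrabber
{
    // Returns the text from each start marker up to (not including) the next end marker.
    public static string[] GrabBetween(string page, string start, string end)
    {
        List<string> posts = new List<string>();
        int s = page.IndexOf(start, StringComparison.OrdinalIgnoreCase);
        while (s >= 0)
        {
            // Look for the end marker only after the start marker.
            int e = page.IndexOf(end, s + start.Length, StringComparison.OrdinalIgnoreCase);
            if (e < 0) break;  // no end marker left: the trailing post is skipped
            posts.Add(page.Substring(s, e - s));
            // Resume the search at the end marker.
            s = page.IndexOf(start, e, StringComparison.OrdinalIgnoreCase);
        }
        return posts.ToArray();
    }
}
```

It would be called as `MarkerGrabber.GrabBetween(page, "<div class=\"windowbg\">", "<hr")`. If the last post on a page is not followed by an `<hr`, both this sketch and the loop above skip it, which might explain why only almost every post comes back.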
Title: Re: Post Grabber
Post by: Kulverstukas on November 02, 2011, 07:41:03 PM
Kinda fucked up to read :D
When I was doing a client for an online dictionary, I didn't know regex at that time. If you want I can show you the extracting part. It's in Delphi though, but you may still get the algorithm idea.
Title: Re: Post Grabber
Post by: Huntondoom on November 02, 2011, 08:34:46 PM
Quote from: Kulverstukas on November 02, 2011, 07:41:03 PM
Kinda fucked up to read :D
When I was doing a client for an online dictionary, I didn't know regex at that time. If you want I can show you the extracting part. It's in Delphi though, but you may still get the algorithm idea.

Sure, but regex is .NET 4.0 so I'm not going to use it :S
Title: Re: Post Grabber
Post by: Kulverstukas on November 02, 2011, 09:16:10 PM
Quote from: Huntondoom on November 02, 2011, 08:34:46 PM
Sure, but regex is .NET 4.0 so I'm not going to use it :S

I said it doesn't use Regex.

Here is the code from a few years ago :D

I used this to get the explanation for a word.
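Code: (Delphi)
// Requires SysUtils (AnsiPos) and StrUtils (PosEx) in the unit's uses clause.
function pavyzdys(yKur : ansistring) : ansistring;
 const StartTag = '<p class=''pavyzdys''>';
       EndTag   = '</p>';
 var StartPos, EndPos : integer;
begin
//====
 result := '';
//====
   // Locate the opening tag; AnsiPos returns 0 when it is not found.
   StartPos := AnsiPos(StartTag, yKur);
   if StartPos = 0 then Exit;
   // Find the first closing tag after the opening one.
   // (The original loop passed the arguments to AnsiPos in reverse order
   // and never stopped if '</p>' was missing.)
   EndPos := PosEx(EndTag, yKur, StartPos + Length(StartTag));
   if EndPos = 0 then Exit;
//====
   // Return the whole element, including both tags.
   result := Copy(yKur, StartPos, EndPos - StartPos + Length(EndTag));
end;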