Homesource Forums

Homeworld Source Editing Talk

All times are UTC - 5 hours




PostPosted: Tue Oct 21, 2008 1:42 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
After looking things over, I have produced a few experimental Universal binaries. It appears that this works if I introduce a macro called something like _MACOSX_86 (which would be defined in the Xcode project file) and change a few #ifdef statements to include "!defined (__ppc__)", so that OS X PPC code is not directed to the x86 side of the binary. I will document most things here so that everyone can see exactly what I am doing and provide input, so that when it comes time to commit, I don't break anything.

Any and all comments are appreciated.


PostPosted: Tue Oct 21, 2008 9:02 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I replaced a line
"#ifdef _MACOSX_FIX_ME"
with
"#if defined (_MACOSX_FIX_ME) && defined (__ppc__)"

This cleans up about 17 compile errors, because the former line was directing the i386 Mac OS X compiler through PPC assembly code. This way the PPC side of things uses the PPC assembly and the i386 side of things uses the x86 code.

added:
#ifdef __i386__
#define _MACOSX_86
#endif
to the Homeworld_prefix.h file in order to toggle code that was intended for the __i386__ world.

modified the line
"#elif defined (__GNUC__) && defined (__i386__)"
to say:
"#elif defined (__GNUC__) && defined (__i386__) && !defined (_MACOSX_86)"

to toggle two blocks of non-PIC assembly language in Matrix.c and Transformer.c
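
A minimal sketch of the gating described above, assuming only the compiler-defined macros `__ppc__` and `__i386__`. The `SKETCH_` macros and the function `sketchCodePath` are made up for illustration and are not part of the Homeworld source. Each slice of a Universal binary is compiled separately, so the preprocessor picks one branch per architecture:

```c
/* Derive one project-level macro per architecture from the macros the
 * compiler defines on its own (hypothetical SKETCH_ names). */
#if defined(__ppc__)
    #define SKETCH_MACOSX_PPC 1
#elif defined(__i386__)
    #define SKETCH_MACOSX_86 1
#endif

/* Hypothetical helper: reports which branch the preprocessor kept. */
const char *sketchCodePath(void)
{
#if defined(SKETCH_MACOSX_PPC)
    return "ppc assembly";
#elif defined(SKETCH_MACOSX_86)
    return "portable C (non-PIC x86 assembly skipped)";
#else
    return "portable C";
#endif
}
```

Printing the result of `sketchCodePath()` from each slice is a quick way to confirm that the PPC build and the i386 build really take different paths.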

The game now builds but crashes when starting a single player game.


PostPosted: Tue Oct 21, 2008 11:44 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Training works fine, but when I start single player, the game crashes at the end of the loading screen.

This tidbit from Particle.c is causing me problems:
Code:
real32 partRealDist(real32 n, real32 d)
{
    real32 r = (real32)((real64)(ranRandom(RANDOM_PARTICLE_STREAM) % 1000000) / 1000000.0);
    real32 sign = (ranRandom(RANDOM_PARTICLE_STREAM) % 2 == 0) ? -1.0f : 1.0f;
    if (d < 0.0f)
        sign = -1.0f;
    return n + sign*r*d;
}

I'm getting 'Program received signal: “EXC_BAD_ACCESS”.'
at:
Code:
    real32 r = (real32)((real64)(ranRandom(RANDOM_PARTICLE_STREAM) % 1000000) / 1000000.0);


This only happens when building for Intel; building for PPC runs fine. I'm gonna go brain this for a while.


Last edited by Axcess on Thu Nov 06, 2008 2:54 pm, edited 1 time in total.

PostPosted: Wed Oct 22, 2008 12:36 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
It would appear that casting a udword to a real32 or a real64 creates a problem at runtime. I am unsure why this is happening; however, the cast appears to be unnecessary, because the function returns a udword, and that udword becomes r after some math and is used as a multiplier.

code that fixes this crash:
Code:
real32 partRealDist(real32 n, real32 d)
{
    udword test1 = ranRandom(RANDOM_PARTICLE_STREAM);
    udword test2 = (test1 % 1000000);
    udword r = (test2 / 1000000);   /* integer division: test2 < 1000000, so r is always 0 */
    real32 sign = (ranRandom(RANDOM_PARTICLE_STREAM) % 2 == 0) ? -1.0f : 1.0f;
    if (d < 0.0f)
        sign = -1.0f;
    return n + sign*r*d;
}


Now the game starts and runs. I still have no idea how stable it is, but as of right now I can start and quit the game without issue.
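
One thing worth noting about the all-integer version: because the remainder is always below 1000000, dividing it by 1000000 as a udword always gives 0, so the random spread term drops out. A minimal sketch that keeps the fraction while splitting the expression into separate steps is below. It is not a diagnosis of the crash. The typedefs are assumed to match the game's, and the two random draws are passed in as parameters (hypothetical signature) so the sketch does not depend on ranRandom:

```c
#include <stdint.h>

typedef float    real32;   /* assumed to match the game's typedefs */
typedef double   real64;
typedef uint32_t udword;

/* Map a raw random draw to a fraction in [0, 1) using floating-point
 * division, one step at a time. */
real32 sketchUnitFraction(udword raw)
{
    udword remainder = raw % 1000000u;
    real64 fraction = (real64)remainder / 1000000.0;
    return (real32)fraction;
}

/* Same shape as partRealDist, with the two draws supplied by the caller. */
real32 sketchRealDist(real32 n, real32 d, udword draw1, udword draw2)
{
    real32 r = sketchUnitFraction(draw1);
    real32 sign = (draw2 % 2u == 0u) ? -1.0f : 1.0f;
    if (d < 0.0f)
        sign = -1.0f;
    return n + sign * r * d;
}
```

Stepping through the intermediate values (`remainder`, `fraction`) in the debugger would also show which conversion, if any, is the one that faults.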


Last edited by Axcess on Thu Nov 06, 2008 2:55 pm, edited 1 time in total.

PostPosted: Wed Oct 22, 2008 11:49 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Just for kicks I decided to turn on HW_BUILD_FOR_DEBUGGING.

A similar error to earlier is returned by this code:

Code:
    while (etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset <
           etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].length)
    {
        pOpcode = etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].code + etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset;
        opcode = *((udword *)pOpcode);                      //get an opcode
#if ETG_ERROR_CHECKING
        if (opcode < 0 || opcode >= EOP_LastOp)
        {
            dbgFatalf(DBG_Loc, "Effect '%s' has a bad opcode in code segment %d offset %d", stat->name, etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].code, etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset);
        }
        if (etgHandleTable[opcode].function == NULL)
        {
            dbgFatalf(DBG_Loc, "etgEffectCodeExecute: NULL opcode %d", opcode);
        }
#endif
        //execute the opcode
        size = etgHandleTable[opcode].function(effect, stat, pOpcode);
        etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset += size;
    }


the problematic line is:

Code:
etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset += size;


now I will go brain this one.

edit: offset is a udword that is equal to 20, and size is an sdword equal to 44

edit: I inserted this line and it does not work either.
Code:
udword test1 = etgExecStack.etgCodeBlockIndex;


Last edited by Axcess on Thu Oct 23, 2008 3:37 pm, edited 1 time in total.

PostPosted: Wed Oct 22, 2008 5:03 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I will shelve the above bug for the moment, as it only applies when building for debugging.

homeworld_prefix.h now looks like this:
Code:
#ifdef _MACOSX
    #ifndef _MACOSX_FIX_ME
        #define _MACOSX_FIX_ME 1
    #endif
    #ifdef __ppc__
        #define _MACOSX_PPC
    #endif
    #ifdef __i386__
        #define _MACOSX_86
    #endif
#endif


My hope is that this will enable me to toggle sound code accordingly, so that while sound is not possible on the ppc side of homeworld, the x86 side might work well with the linux code.


Last edited by Axcess on Thu Oct 23, 2008 3:35 pm, edited 1 time in total.

PostPosted: Thu Oct 23, 2008 6:58 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Hi.

If you're putting chunks of code into the forum, can you wrap it with a code block, and a hint about where it's come from? :)

Code:
etgExecStack.etgCodeBlock[etgExecStack.etgCodeBlockIndex].offset += size;


It seems odd to me (but not unique) that the error comes from outside the code enabled by the define.

What's the opcode defined as, and where in memory is etgExecStack?

Aunxx.


PostPosted: Thu Oct 23, 2008 3:36 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Ah, yes, I was kind of wondering about that. I'll fix that.


PostPosted: Thu Oct 23, 2008 4:02 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
OK, I am not sure if I did this right, but I dropped in the line:
Code:
ubyte *adr = &etgExecStack;

right before the line that fails, and my debugger reports that the address stored in adr is 0x0. Now even I know this is a problem (unless I did it wrong), but it leaves me with a lot of questions. C is a second programming language to me: I can read it, but I'm not so hot at writing it. I am normally a Java programmer, and Java is pretty much immune to memory problems like this, which means I have no idea how this happened or what the best way to fix it is.

I assume that I should grab the address way back when it was declared, and then do something with that.

Edit: oh, and opcode is a udword, which makes the first part of the above if statement (opcode < 0) pointless.
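
For an unsigned opcode, the upper-bound comparison alone covers both ends, because a negative value stored in an unsigned type wraps around to a very large number. A small sketch, assuming udword is a 32-bit unsigned type (the helper name is hypothetical):

```c
#include <stdint.h>

typedef uint32_t udword;   /* assumed: udword is an unsigned 32-bit type */

/* Hypothetical helper: returns nonzero when opcode is a valid index
 * below lastOp. No "opcode < 0" test is needed for an unsigned type. */
int sketchOpcodeInRange(udword opcode, udword lastOp)
{
    return opcode < lastOp;
}
```

So `(udword)-1` is rejected by the same comparison that rejects any other out-of-range value.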


PostPosted: Thu Oct 23, 2008 5:30 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I think I've clobbered this one. After running a line-by-line debug, I found that it is a bit of asm that is called:

Code:
#elif defined (__GNUC__) && defined (__i386__)
        __asm__ __volatile__ (                              /* push it onto the stack */
            "pushl %0\n\t"
            :
            : "a" (param) );
#endif


Code:
#elif defined (__GNUC__) && defined (__i386__)
        __asm__ __volatile__ (                              /* pass a 'this' pointer */
            "pushl %0\n\t"
            :
            : "a" (effect) );
#endif


I don't think these two are doing what they are supposed to be doing... My first question is, does pushl involve ebx?


PostPosted: Thu Oct 23, 2008 11:06 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I modified the ifdef of this assembly to exclude OS X, and this addressing problem went away. However, as soon as it was gone, another one showed up to take its place, this time in a pointer called effect. I will go back to dealing with sound issues for the time being.

I am trying to figure out how to get Xcode to link .o files to .h files, because I believe that since I am running an x86 machine, I should be able to link those files and then remove some ifdefs in the sound engine source files... and then maybe it will just work... maybe...


PostPosted: Fri Oct 24, 2008 12:58 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
It has come to my attention that _MACOSX is directly associated with FIX_ENDIAN... I'm not sure about this, but I think Intel Macs are little endian while PPC Macs are big endian... which means UGH.


I realized after bashing my head into the .o files that the reverse engineered .c files are in my src directory but not in my project file... this project file is old. I added and linked the appropriate files, and looked through the OS X code files to remove the ifdef code toggles. Everything appears to be working OK, other than the fact that I have no sound... at all... not on PPC or i386... I must have missed one someplace.
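
For the endian side of this, a minimal sketch of what a FIX_ENDIAN-style conversion usually boils down to, assuming the data files store 32-bit values little endian. All names here are hypothetical and this is not the project's own macro:

```c
#include <stdint.h>
#include <string.h>

/* Returns nonzero when the lowest-addressed byte of a 32-bit 1 is 1,
 * which is the case on little-endian hosts such as Intel Macs. */
int sketchHostIsLittleEndian(void)
{
    uint32_t probe = 1u;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Reverse the byte order of a 32-bit value. */
uint32_t sketchSwap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Convert a value that was stored little endian into host order. */
uint32_t sketchLittleToHost32(uint32_t v)
{
    return sketchHostIsLittleEndian() ? v : sketchSwap32(v);
}
```

On an Intel build the conversion is a no-op, which is why tying the swap to _MACOSX alone (rather than to the architecture) would be wrong for a Universal binary.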


PostPosted: Fri Oct 24, 2008 3:29 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Mornin'

I had the same problem with the address of etgExecStack so I wondered if you had the same thing. Cannot immediately remember what caused it though, but I'll try and remember.

Endian -- Yes. Enjoy.

pushl shouldn't involve ebx, but it perturbs the stack. There should be an associated pop somewhere down the line or it can cause a lot of problems.

Do you use a debugger? It makes life a lot easier. :)

Sound should work in x86 provided the SDL sound is working. There also shouldn't be any problem in getting the ppc sound to work now that we don't statically link to the old relic objects.

Aunxx.


PostPosted: Fri Oct 24, 2008 8:51 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I'll have to look at that loop again and see where it is supposed to pop the stack... because that appears to be the problem.

As for a debugger, I use the one that comes with Xcode, which I am generally happy with, other than the fact that it does not give me the addresses of anything other than pointers, hence the line I inserted above.

The funny thing is, my debugger does not lose track of the memory address, so it acts like everything is there, fine and dandy...


PostPosted: Fri Oct 24, 2008 11:19 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
It would appear that the first chunk of assembly above is run 3 times before the crash; the second chunk is not run at all before the crash. If I comment out the first block, the crash does not happen. If you cannot tell, I am learning C as I go along, so I am still getting used to reading pointer syntax and other things like that... I don't think this is an endian issue. I have large blocks of assembly in transform and matrix commented out, but I don't think those have anything to do with this.

Having learned on Java, how an instance of a variable or a struct can lose its memory address baffles me... how does this happen? And the concept of a global stack also baffles me... why do we need to use this stack directly? Isn't there a way to do this without using assembly?
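
On the last question: one common way to call a handler with a variable number of arguments without pushing them by hand is to switch on the parameter count and call through a typed function pointer. The sketch below is only an illustration of that idea, not how etgFunctionCall is written. Every name in it is hypothetical, and udword is assumed to be a plain unsigned 32-bit type:

```c
typedef unsigned int udword;   /* assumed to match the game's type */

typedef udword (*sketchFn0)(void);
typedef udword (*sketchFn1)(udword);
typedef udword (*sketchFn2)(udword, udword);

/* A handler records how many parameters it takes and a pointer of the
 * matching type, so the compiler sets up the call instead of inline asm. */
typedef struct
{
    int nParams;
    union
    {
        sketchFn0 f0;
        sketchFn1 f1;
        sketchFn2 f2;
    } call;
} sketchHandler;

udword sketchDispatch(const sketchHandler *handler, const udword *params)
{
    switch (handler->nParams)
    {
        case 0:  return handler->call.f0();
        case 1:  return handler->call.f1(params[0]);
        case 2:  return handler->call.f2(params[0], params[1]);
        default: return 0;   /* unsupported parameter count in this sketch */
    }
}

/* Example handler used to exercise the dispatcher. */
udword sketchMulti(udword first, udword second)
{
    return first * second;
}
```

Because the compiler generates the argument passing, the stack stays balanced on every path, which is exactly the property a lone pushl without a matching pop does not guarantee.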


PostPosted: Fri Oct 24, 2008 9:11 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Well, after focusing on sound a while longer I have discovered my problem... I'm an idiot.

Code:
//#ifdef _MACOSX_FIX_ME
   return SOUND_ERR;
//#endif

This is not helpful...

But now that this is out of the way, sound works, which is a first for me. I believe lmop was making headway on the PPC side of things. I am not really concerned with that right now: sound on PPC does not work for me, and since he was already working on it I will wait for him to commit his work. I can disable the sound engine pretty easily by simply replacing the above code with this.
Code:
#ifdef __ppc__
   return SOUND_ERR;
#endif


Now, however, I have the problem of Homeworld freezing when quitting... I suspect an infinite loop someplace.


PostPosted: Fri Oct 24, 2008 9:29 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
found an infinite loop.

Code:
         while (!(mixer.status == SOUND_STOPPED))
         {
            musicEventUpdateVolume();
            SDL_Delay(0);
         }


After tracing it I found that SOUND_ERR is declared several times, so I am going to test for SOUND_ERR and break if the condition exists. Even if I fix this problem, an infinite loop when the sound engine does not want to stop seems like a problem to me.
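
One way to keep a shutdown wait from hanging forever is to bound the number of polls. A minimal sketch of that pattern follows. All names are hypothetical, and in the real loop the body would still call musicEventUpdateVolume() and SDL_Delay() between polls:

```c
#include <stdbool.h>

/* Hypothetical poll callback: returns true once the awaited state is reached. */
typedef bool (*sketchPollFn)(void *context);

/* Poll at most maxPolls times. Returns true if the condition was met,
 * false if the caller should give up and shut down anyway. */
bool sketchWaitUntil(sketchPollFn isDone, void *context, unsigned maxPolls)
{
    unsigned i;
    for (i = 0; i < maxPolls; i++)
    {
        if (isDone(context))
            return true;
        /* the real loop would update volume and sleep briefly here */
    }
    return false;
}

/* Example poll: counts an int down and reports done when it reaches zero. */
bool sketchCountdownDone(void *context)
{
    int *remaining = (int *)context;
    if (*remaining > 0)
        (*remaining)--;
    return *remaining == 0;
}
```

With something like this, a mixer that never reports SOUND_STOPPED costs a bounded delay at quit time instead of a freeze.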


PostPosted: Fri Oct 24, 2008 10:31 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Hmm, testing for SOUND_ERR is harder than it seemed. I have the suspicion that many of my problems are coming from _MACOSX_FIX_ME... the problem is that many of the code toggles may be PPC or little endian specific. I will examine each one, one by one, and replace them with either _MACOSX_PPC or _MACOSX_86, based on the situation. Perhaps I should also declare a FIX ME for each of those, but that might be too many macros... either way, I feel that less _MACOSX_FIX_ME would be a good thing.

edit: I just wanted to share this:

Code:
#ifdef _MACOSX_FIX_ME
    // requires tga.h or equivalent
    #define GLFONT_OUTPUT_TARGAS 0
#else
    //output targas for the glfont pages if !0
    #define GLFONT_OUTPUT_TARGAS 0
#endif


PostPosted: Fri Oct 24, 2008 11:39 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
ah, here we go, I think this is the cause of some of my problems.
Code:
#ifdef _MACOSX_FIX_ME // FIX_ENDIAN?
   FlagBit = 1 << (31 - Index & 31);
#else
   FlagBit = 1 << (Index & 31);
#endif



It would be really nice if I could commit right about now, because I am almost done deprecating _MACOSX_FIX_ME.
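
For reference, the two branches in that snippet number the bits of a 32-bit word from opposite ends. In C, `31 - Index & 31` parses as `(31 - Index) & 31`, since subtraction binds tighter than `&`. A small sketch with hypothetical function names, using an unsigned 1 so that shifting into bit 31 is well defined:

```c
#include <stdint.h>

/* Bit 0 is the least significant bit (the #else branch above). */
uint32_t sketchFlagBitLsbFirst(unsigned index)
{
    return (uint32_t)1 << (index & 31u);
}

/* Bit 0 is the most significant bit (the _MACOSX_FIX_ME branch above). */
uint32_t sketchFlagBitMsbFirst(unsigned index)
{
    return (uint32_t)1 << ((31u - index) & 31u);
}
```

Which numbering is right depends on how the flag words were written to disk, so the choice presumably belongs with the endian handling rather than with the OS.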


PostPosted: Sat Oct 25, 2008 2:18 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Well, other than the sound engine shutdown issue at quit time, things are running pretty well. I just played through the first level and reasonably far into the second. I didn't have any issues.

Playing a single player game after building for debugging still does not work; I am still unable to figure out the stack problem above.


PostPosted: Mon Oct 27, 2008 3:19 pm 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Hi.

Two things occur to me to ask.

1) What version of the codebase are you working from?

2) Are you using your own SVN server? I must admit that when I was working on the x86_64 code I found it invaluable to be able to see what I'd changed and be able to revert things if I'd got them very wrong. (Which I did on a couple of occasions. :) )

Aunxx


PostPosted: Mon Oct 27, 2008 5:34 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I have my own SVN server, but it really didn't occur to me to use it until now. (I am interning as a sys admin at my school right now, and I've been doing Linux/OS X administration as a hobby for a few years now.)

I should be using the latest source code from the Homeworld SVN server (r674, I believe). I have an account capable of checkout... I don't think I can commit; however, I may not have Xcode set up properly.

I don't really understand why that thing is so sacred. I mean, SVN can revert anything you commit, right? So why is committing such a big deal?... But if I have to earn people's trust before I can commit anything to the homesource server, then I'll do my best to do that.


PostPosted: Mon Oct 27, 2008 5:50 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
But then again, perhaps lmop did give me commit privileges and I just have a problem on my end, because I can't commit to my own server either.


PostPosted: Wed Oct 29, 2008 9:47 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I have been working on converting all the x86 assembly to be -fPIC compliant. I believe that this is the source of some of my memory problems: registers get clobbered, and Xcode is more sensitive to it because the OS X environment requires the -fPIC flag on gcc. Once I do this I think it will be better for the Linux platforms as well, because I have been reading about PIC compliant code, and it is supposed to optimize better. I am going to start with Matrix.c and Transform.c and see if that brings any improvement in performance or any fewer crashes. If you see any problems with this, please point them out.

Here is a block of assembly I already did from Matrix.c
Code:
#elif defined (__GNUC__) && defined (__i386__)
    __asm__ __volatile__(
      "    pushl   %%edi\n"
      "    pushl   %%esi\n"
      "    pushl   %%ebx\n"
                  
      "    movl    %0, %%esi\n"
      "    movl    %1, %%ebx\n"
      "    movl    %2, %%edi\n"
                   
        "    flds    0*"FSIZE_STR"("SOURCE")\n"               /*s0*/
        "    fmuls   (0+0*3)*"FSIZE_STR"("MATRIX")\n"         /*a0*/
        "    flds    1*"FSIZE_STR"("SOURCE")\n"               /*s1 a0*/
        "    fmuls   (1+0*3)*"FSIZE_STR"("MATRIX")\n"         /*a1 a0*/
        "    flds    2*"FSIZE_STR"("SOURCE")\n"               /*s2 a1 a0*/
        "    fmuls   (2+0*3)*"FSIZE_STR"("MATRIX")\n"         /*a2 a1 a0*/

        "    fxch    %%st(1)\n"                               /*a1 a2 a0*/
        "    faddp   %%st, %%st(2)\n"                         /*a2 a1+a0*/

        "    flds    0*"FSIZE_STR"("SOURCE")\n"               /*s0 a2 a1+a0*/
        "    fmuls   (0+1*3)*"FSIZE_STR"("MATRIX")\n"         /*b0 a2 a1+a0*/

        "    fxch    %%st(1)\n"                               /*a2 b0 a1+a0*/
        "    faddp   %%st, %%st(2)\n"                         /*b0 d0*/

        "    flds    1*"FSIZE_STR"("SOURCE")\n"               /*s1 b0 d0*/
        "    fmuls   (1+1*3)*"FSIZE_STR"("MATRIX")\n"         /*b1 b0 d0*/
        "    flds    2*"FSIZE_STR"("SOURCE")\n"               /*s2 b1 b0 d0*/
        "    fmuls   (2+1*3)*"FSIZE_STR"("MATRIX")\n"         /*b2 b1 b0 d0*/

        "    fxch    %%st(1)\n"                               /*b1 b2 b0 d0*/
        "    faddp   %%st, %%st(1)\n"                         /*b1+b2 b0 d0*/

        "    flds    0*"FSIZE_STR"("SOURCE")\n"               /*s0 b1+b2 b0 d0*/
        "    fmuls   (0+2*3)*"FSIZE_STR"("MATRIX")\n"         /*c0 b1+b2 b0 d0*/

        "    fxch    %%st(1)\n"                               /*b1+b2 c0 b0 d0*/
        "    faddp   %%st, %%st(2)\n"                         /*c0 d1 d0*/

        "    flds    1*"FSIZE_STR"("SOURCE")\n"               /*s1 c0 d1 d0*/
        "    fmuls   (1+2*3)*"FSIZE_STR"("MATRIX")\n"         /*c1 c0 d1 d0*/
        "    flds    2*"FSIZE_STR"("SOURCE")\n"               /*s2 c1 c0 d1 d0*/
        "    fmuls   (2+2*3)*"FSIZE_STR"("MATRIX")\n"         /*c2 c1 c0 d1 d0*/

        "    fxch    %%st(1)\n"                               /*c1 c2 c0 d1 d0*/
        "    faddp   %%st, %%st(1)\n"                         /*c1+c2 c0 d1 d0*/

        "    fxch    %%st(2)\n"                               /*d1 c0 c1+c2 d0*/
        "    fstps   1*"FSIZE_STR"("DEST")\n"                 /*c0 c1+c2 d0*/
        "    fxch    %%st(1)\n"                               /*c1+c2 c0 d0*/
        "    faddp   %%st, %%st(1)\n"                         /*d2 d0*/
        "    fxch    %%st(1)\n"                               /*d0 d2*/
        "    fstps   0*"FSIZE_STR"("DEST")\n"                 /*d2*/
        "    fstps   2*"FSIZE_STR"("DEST")\n"
                   
      "    popl    %%ebx\n"
      "    popl    %%esi\n"
      "    popl    %%edi\n"
        :
        : "r" (vector), "r" (result), "r" (matrix)
       );// Sireg "S"    breg "b"      Direg "D"


This would not even compile before, and now it runs just fine.

Edit: I also believe that this code would allow removal of the __volatile__ tag, which would also allow better optimization... but I will worry about that later.
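
For comparison, here is a plain C sketch of what the assembly block above appears to compute, read off its index expressions: each output element is the sum of `source[col] * matrix[col + row*3]` over the three columns. It assumes 32-bit floats and that SOURCE, MATRIX and DEST refer to the input vector, the 3x3 matrix and the result. The function name is hypothetical:

```c
/* dest[row] = source[0]*matrix[0 + row*3]
 *           + source[1]*matrix[1 + row*3]
 *           + source[2]*matrix[2 + row*3]
 * A temporary is used so dest may alias source. */
void sketchMatMultVec3(float dest[3], const float source[3], const float matrix[9])
{
    float tmp[3];
    int row, col;

    for (row = 0; row < 3; row++)
    {
        tmp[row] = 0.0f;
        for (col = 0; col < 3; col++)
            tmp[row] += source[col] * matrix[col + row * 3];
    }
    for (row = 0; row < 3; row++)
        dest[row] = tmp[row];
}
```

A version like this has no register or stack constraints to get wrong, so it makes a handy reference to check the assembly's output against, and it is roughly what the PPC build has to rely on anyway.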


PostPosted: Thu Oct 30, 2008 2:02 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Seems to be successful. I have gone through all the x86 assembly I could find that makes use of registers and made sure the registers were not clobbered... I ran through the first level a few times and did not see any crashes due to random memory errors. I could just be lucky, but more testing will follow.


PostPosted: Thu Oct 30, 2008 7:27 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I have gone back to etgFunctionCall. I have come to the conclusion that there is nothing wrong with the ETG itself, but with something from build for debugging, considering the problem only comes about when I build for debugging. I have been running through various parts of Homeworld and turning various blocks of debugging code on and off to see if it makes a difference; so far no luck.

the question is, What block of debugging code is run only before single player (not before training), and messes with the input for etgFunctionCall?


PostPosted: Fri Oct 31, 2008 2:52 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
FOUND IT!
Code:
#if DBG_ASSERT
    #define dbgAssertOrIgnore(expr) if (!(expr)) { dbgFatalf(DBG_Loc, "Assertion of (%s) failed.", #expr); }
    #define dbgAssertAlwaysDo(expr) dbgAssertOrIgnore(expr)
#else
    #define dbgAssertOrIgnore(expr) ((void)0)
    #define dbgAssertAlwaysDo(expr) (expr)
#endif


The problem is in the definition of this macro. When DBG_ASSERT is defined as 1, the game crashes due to problems with etgFunctionCall; when it is defined as 0, the game runs fine. I tested further by commenting out the #if/#else lines so that the #else branch is the only option, and problem solved. Now I need to take a look at places where this macro is called...
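
A general C note, not offered as the cause of this particular crash: a macro that expands to a bare `if (...) { ... }` can interact badly with a surrounding if/else at the call site. The usual idiom is to wrap the body in `do { ... } while (0)` so the macro behaves like a single statement. A minimal sketch with hypothetical names, counting failures instead of calling dbgFatalf:

```c
/* Hypothetical failure sink: the real macro calls dbgFatalf instead. */
int sketchAssertFailures = 0;

void sketchReportFailure(const char *exprText)
{
    (void)exprText;
    sketchAssertFailures++;
}

/* The do/while(0) wrapper makes the macro a single statement, so it can
 * sit between an if and its else without changing how they pair up. */
#define SKETCH_ASSERT(expr) \
    do { if (!(expr)) { sketchReportFailure(#expr); } } while (0)

int sketchClassify(int value)
{
    if (value >= 0)
        SKETCH_ASSERT(value < 100);
    else
        return -1;
    return 1;
}
```

Checking how dbgAssertOrIgnore is used at its call sites, especially directly under an unbraced if, would be one cheap thing to rule out.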


PostPosted: Sat Nov 01, 2008 8:10 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Well, I found the offending statement... but it's weirder than I thought...
Code:
   evt = ((etgfunctioncall *)opcode)->parameter[index].type;
        switch (evt)
        {
            case EVT_Constant: //6
      //printf("Constant!");
                break;
            case EVT_Label: //7
                //dbgAssertOrIgnore(FALSE);
                break;
            case EVT_ConstLabel: //8
                param = (udword)stat->constData + param;
                break;
            case EVT_VarLabel: //9
                param = (udword)effect->variable + param;
                break;
            default:
                param = *((udword *)(effect->variable + param));
                break;
        }


I have it commented out... and everything works fine... but I refuse to leave it at that, because this makes zero sense, for several reasons. First of all, the case EVT_Label never presents itself once from startup to the crash. Also, if I replace that line with something like printf("test"); the same crash happens. It appears as if any line below the EVT_Label case causes the game to crash. I found this also applies to EVT_Constant: if any code exists within these two cases, the game crashes, for what appears to be no legitimate reason. I think some assembly someplace is to blame... OS X's requirement of -fPIC assembly means no clobbered registers, but I am starting to wonder if there is another requirement that interferes with direct use of the stack.


PostPosted: Mon Nov 03, 2008 12:40 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Well, I just wasted a lot of time on a dumb problem... apparently, if the path to Homeworld is too long (greater than 128 characters?), the game goes into an infinite loop when starting a single player game because it cannot toLower the path name... and I tried so much stuff to try and fix it...
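
A defensive way to lowercase a path into a fixed-size buffer is to check the length first and report failure rather than loop or overflow. A minimal sketch follows. The function name is hypothetical and the buffer size is up to the caller, since the 128 figure above is only a guess:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst as lowercase. Returns 0 on success, or -1 when src
 * plus its terminator does not fit in dstSize bytes (dst is untouched). */
int sketchLowerCopy(char *dst, size_t dstSize, const char *src)
{
    size_t len = strlen(src);
    size_t i;

    if (dstSize == 0 || len >= dstSize)
        return -1;

    for (i = 0; i < len; i++)
        dst[i] = (char)tolower((unsigned char)src[i]);
    dst[len] = '\0';
    return 0;
}
```

A caller that gets -1 back can print the offending path and bail out, which would have turned the hang into an obvious error message.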


PostPosted: Mon Nov 03, 2008 2:42 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Hi.

There is some weirdness around that set of functions, and it is invariably a problem elsewhere.

What is the address of opcode, and what does the opcode look like?
Code:
*(etgfunctioncall *)opcode


It is difficult to try and help because of just how intricate this file is and how it works. There are a few hints I should be able to give, but I found that the only way to really inspect the problem is to get really, really good with a debugger, find which parts of the opcode were wrong, and then trace back to where that was set.

Quote:
The GNU debugger (GDB) allows you to step through code, watch values, and monitor execution from the command-line. Xcode uses GDB as the basis for its debugger. If you are using GCC you will want to also learn to use GDB.


So I suspect that your Xcode version will be more than adequate, as I used gdb. :)

Aunxx.


PostPosted: Mon Nov 03, 2008 11:09 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Hey, thanks for the hints. Xcode does have gdb, but I really have no idea how to use it from a console; however, I have been stepping through the code. The biggest problem is that, being a Java programmer, I really don't have a lot of experience with pointers. I think that in the end my persistence will pay off. For the time being, lmop gave me SVN write privileges, so I will be reverting to the base code, then doing things over one by one in a series of commits. First I will do all the commits that should only affect the OS X side of things. Then I will cautiously move on to things that may affect other platforms. For example, I will need to determine if the assembly changes will need to be broken down into an OS X ifdef, or if Linux can deal with them. I am also unsure if some of them are even necessary, so I will be doing them again as needed from scratch.


PostPosted: Mon Nov 03, 2008 11:56 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
OK, welcome to the world of half commits... apparently I have rights to update some files, but not others...

Lacking these 4 files should only be a problem for OS X systems, but it would be nice to have the ability to finish the commit...


PostPosted: Mon Nov 03, 2008 11:08 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
ok, the rest of it is in, apparently my internet is terrible.


PostPosted: Wed Nov 05, 2008 1:01 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I am going to try a new approach. I am going to bypass all assembly (it does not run on the PPC version of Homeworld anyway... how important could it be?).
Then I am going to bypass etgFunctionCall.
At this point the game runs without effects.
Now I will run around and try to squash all memory problems i can find, and get the game running stable, without effects.
If I can achieve this, I will then reintroduce etgFunctionCall and try to debug it.

Here goes....

The game crashes on quit, due to an error here:
Code:
    int result;
    result = munmap(heap, 0);
    dbgAssertOrIgnore(result != -1);


munmap is returning -1.

I looked it up, and 0 is an invalid argument... checking errno confirms this. The second argument is supposed to be a multiple of the system page size... soooo... what about this?

Code:
    int result;
    long int heapPageSize = sysconf(_SC_PAGE_SIZE);
    printf( "Page Size: %li\n", heapPageSize );
    result = munmap(heap, heapPageSize);
    printf( "Error: %s\n", strerror( errno ) );
    dbgAssertOrIgnore(result != -1);


This would give munmap one page size... but how do I know the heap is one page? I don't... this fixes the crash, but I do not know if this is a real fix or not. Is there a way to find out how many pages the heap is?

Also, munmap is used with "0"s in multiple places... does this work on Linux? Everything I am looking at says it should not.

edit: this is in utility.c


PostPosted: Fri Nov 07, 2008 8:28 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
It does work in Linux, and I'm having a look now if there's an error.

There are multiple heaps created, at least two but probably more. You can inspect the
Code:
memNumberGrowthPools
value which I believe shows how many are created.

I wouldn't worry about trying to close them all though because the function is called for each heap rather than trying to close them all in one call.

Have a look at :
http://homesource.nekomimicon.net/sourceforum/viewtopic.php?f=4&t=233

The memory functions are all designed around requesting a big block of memory and then dealing with it itself. I would say that the unix based malloc routines are a lot more mature than the old MS code.

Aunxx


PostPosted: Sun Nov 09, 2008 3:40 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Just for the fun of it I am going to compile on my ppc dual 1ghz g4, and see if any of these bugs span across the great divide.


PostPosted: Mon Nov 10, 2008 1:38 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
It does not appear as if the PPC world is affected by anything major, only a few minor display bugs. One thing that I was surprised not to see on PPC is a bug from x86: there, the background appears to come and go depending on where I position the camera (the background along with various other things, like health bars and the blue background of the game menus). I think this may be associated with some of my memory problems and some of the GL crashes that keep happening... but maybe it is because I am running the game on an Intel graphics card rather than the GeForce 4 in my PPC Mac... where would be a good place to start looking for this? (I know I am ADD with bugs, I just jump from bug to bug)...


PostPosted: Tue Nov 11, 2008 4:20 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
The graphics issues described above also occur when I run a PPC-compiled binary on my Intel Mac (Rosetta emulation built into OS X), so they occur on my Intel GMA X3100; however, they do not occur on the GeForce 4 MX in my Power Mac G4. I think it is safe to assume this issue is because of the graphics card. I am unsure what to do about it, and I do not know if these problems occur when I run Windows on this Mac. I will try that later.


PostPosted: Tue Jan 06, 2009 9:12 pm 
Offline

Joined: Sun Jan 04, 2009 6:55 pm
Posts: 1
Have these changes been committed to the scm system? Particularly the random number fix that prevents the crash? I checked out the code over the weekend and it wasn't. As far as I can tell that makes the game playable, I haven't had any problems yet.


Top
 Profile  
 
PostPosted: Sun Feb 01, 2009 12:16 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
The majority of these fixes have not been committed, because I do not know why they work. I am not comfortable making changes to the code unless I know they are fixing the problem, not just working around the symptoms.


Top
 Profile  
 
PostPosted: Sat Mar 07, 2009 3:03 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I'm alive... and I'm going after this with new fervor... I guess... (I am also a bit more knowledgeable with C now.) I didn't know things like function pointers existed. I would highly recommend this: http://boredzo.org/pointers/


Top
 Profile  
 
PostPosted: Sat Mar 07, 2009 7:20 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
My new theory is that this has something to do with function length, because if I add arbitrary statements to functions, it changes the behavior of the crash. For example, if I plop a:
Code:
printf("Baste a Chicken");

into the middle of etgFunctionCall(), it causes it to crash at a different point, and sometimes if I add several lines of code, it crashes less or not at all. This is why some of my previous fixes appeared to work while everything was still generally unstable. I'll look into this more later.


Top
 Profile  
 
PostPosted: Sun Mar 08, 2009 9:01 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
Well, I put down the Homeworld code and wrote a program of my own, trying to duplicate etgFunctionCall's methodology in simpler terms. I have duplicated the bug!

Code:
#include <stdio.h>

int multi (int first, int second);

int main(){
   int rtn = 0;
   
   int (*funcPtr)() = &multi;
   
   printf("before asm");
   asm volatile(
      "movl   $5, %%eax\n"
      "pushl   %%eax\n"
      "movl   $9, %%eax\n"
      "pushl   %%eax\n"
      :
      :
      :"eax");
   
   
   rtn = funcPtr();   
   
   printf("%d\n", rtn);
}

int multi(int first, int second){
   printf("function running");
   return first*second;
}


This runs fine on my machine, but if I comment out the "function running" and the "before asm" printf statements, I get a crash. This is precisely what is happening in Homeworld. If I can fix it here, I can fix it there.


Top
 Profile  
 
PostPosted: Mon Mar 09, 2009 12:44 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I think I found a solution. This fixed all the bugs in my little test program; now I just have to adapt it to Homeworld.

Code:
asm (
                "movl   $5, %%eax\n\t"
                "movl   %%eax, 4(%%esp)\n\t"
                "movl   $9, %%eax\n\t"
                "movl   %%eax, (%%esp)"
                :
                :
                :"eax");


Top
 Profile  
 
PostPosted: Mon Mar 09, 2009 5:12 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
It may be a little early to tell, but I think I might have finally squashed this etgFunctionCall bug. The game appears to be running well.

I made the following changes, and I believe these might even be commit worthy. I will do more testing.
Code:
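    int offset;      /* byte offset from %esp where this parameter is stored */
    int count = 0;   /* number of parameters handled so far */
    for (index = (sdword)nParams - 1; index >= 0; index--)
    {               //for each parameter
        count++;
        if (opptr->passThis) {
            offset = nParams*4 - count*4 + 4;   /* leave (%esp) free for the 'this' pointer */
        }
        else {
            offset = nParams*4 - count*4;
        }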

Code:
#elif defined (__GNUC__) && defined (__i386__) && !defined (_MACOSX_86)
        __asm__ __volatile__ (                              /* push it onto the stack */
            "pushl %0\n\t"
            :
            : "a" (param) );
#elif defined (_MACOSX_86)
        __asm__ __volatile__ (                              /* store it at esp+offset; the stack pointer does not move */
            "movl %0, (%%esp,%1)\n\t"
            :
            : "a" (param), "r" (offset) );
#endif

Code:
#elif defined (__GNUC__) && defined (__i386__) && !defined (_MACOSX_86)
        __asm__ __volatile__ (                              /* pass a 'this' pointer */
            "pushl %0\n\t"
            :
            : "a" (effect) );
#elif defined (_MACOSX_86)
        __asm__ __volatile__ (                              /* store the 'this' pointer in the first slot, at (%esp) */
            "movl %0, (%%esp)\n\t"
            :
            : "a" (effect) );
#endif


I will probably ifdef the if/else statement above out of the way, because it is certainly unnecessary for other platforms. One question I have is: does my code for OS X work on Linux as well?
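
The offset arithmetic in the loop above can be checked in isolation. What follows is a minimal standalone sketch, not the Homeworld code itself: `param_offset` and `print_offsets` are hypothetical helper names, and the sketch assumes every parameter is 4 bytes wide, as on i386.

Code:
```c
#include <stdio.h>

/* Hypothetical helper (not in the Homeworld source): byte offset from %esp
 * at which parameter `index` (0-based) is stored. Assumes 4-byte parameters.
 * Mirrors the loop above, where count runs 1..nParams while index runs
 * nParams-1..0, so count == nParams - index. */
static int param_offset(int nParams, int index, int passThis)
{
    int count = nParams - index;
    int offset = nParams * 4 - count * 4;   /* simplifies to index * 4 */
    if (passThis) {
        offset += 4;                        /* slot 0 is kept for the 'this' pointer */
    }
    return offset;
}

/* Print the slot of every parameter, in the order the loop visits them. */
static void print_offsets(int nParams, int passThis)
{
    int index;
    for (index = nParams - 1; index >= 0; index--) {
        printf("param %d -> %d(%%esp)\n", index,
               param_offset(nParams, index, passThis));
    }
}
```

With three parameters and passThis set, the parameters land 12, 8 and 4 bytes above %esp, which leaves (%esp) itself for the effect pointer written by the 'this' block. Since count is always nParams - index, the expression reduces to index*4 (plus 4 with passThis), so the separate counter could probably be dropped.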


Top
 Profile  
 
PostPosted: Mon Mar 09, 2009 11:17 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I just committed all the changes necessary to turn off all assembly for OS X, with the exception of the above blocks, which I modified and wrapped in an ifdef so that only OS X runs the new assembly. If anyone is bored, they should try out the asm I wrote on Linux. I am curious whether it could eventually function as a replacement for the pushes and pops, but then again the pushes and pops don't work on OS X. I deliberately did not commit several files because I want to be sure that they are fixed in the best way possible.
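
To see which branch the guards described above would pick on a given machine, a small test file along these lines may help. It is only a sketch: `param_pass_path` is a made-up name, the PowerPC guard is simplified to `__ppc__` alone (the real code also tests `_MACOSX_FIX_ME`), and because a standalone file has no Homeworld_prefix.h, it additionally tests Apple's predefined `__APPLE__` macro before defining `_MACOSX_86`.

Code:
```c
#include <stdio.h>

/* The thread defines _MACOSX_86 in Homeworld_prefix.h, which only the Xcode
 * build includes. This standalone sketch has no prefix header, so it also
 * tests __APPLE__ (an assumption of this sketch, not the project's code). */
#if defined(__APPLE__) && defined(__i386__) && !defined(_MACOSX_86)
#define _MACOSX_86
#endif

/* Report which parameter-passing branch the guards would select. */
static const char *param_pass_path(void)
{
#if defined(__GNUC__) && defined(__i386__) && !defined(_MACOSX_86)
    return "pushl (non-OS X i386)";
#elif defined(_MACOSX_86)
    return "movl to esp+offset (OS X i386)";
#elif defined(__ppc__)
    return "PowerPC path";
#else
    return "no assembly branch for this target";
#endif
}
```

Printing the result of `param_pass_path()` from a throwaway main on each machine should show whether the Linux and OS X builds really end up in different branches.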


Top
 Profile  
 
PostPosted: Tue Mar 10, 2009 3:43 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Hi.

I've tried the code you supplied on Linux (i386). The first section, using pushl, compiles and runs, then segfaults on exit.
After modifying the code to include the new working bit, it also compiled and ran, and exited cleanly.
You can make the first code section work cleanly on Linux by tidying the stack after the function call.
Code:
#include <stdio.h>

int multi (int first, int second);

int main(){
   int rtn = 0;
   
   int (*funcPtr)() = &multi;
   
   printf("before asm\n");
   asm volatile(
      "movl   $5, %%eax\n"
      "pushl   %%eax\n"
      "movl   $9, %%eax\n"
      "pushl   %%eax\n"
      :
      :
      :"eax");
   
   
   rtn = funcPtr();   
   
   printf("%d\n", rtn);
   asm volatile(
       "popl %%eax\n"
       "popl %%eax\n"
       :
       :
       :"eax");

}

int multi(int first, int second){
   printf("function running\n");
   return first*second;
}


Writing the values directly onto the stack without altering the stack pointer works because there is then nothing to tidy up afterwards.

Aunxx.


Top
 Profile  
 
PostPosted: Tue Mar 10, 2009 8:47 am 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I am coming to realize that OS X simply does not work the same way Linux does. I pasted the above code exactly as it is into vi in a terminal and compiled it. It compiled all right, but it segfaults immediately after "function running". gcc on OS X works very differently than it does on Linux. I borrowed a buddy's Linux machine and did a gcc -S with the previous code, then did the same on my machine. I compared the raw assembly with a comparator program I have, and only a few lines really matched up.

I think part of it is that OS X forces -fPIC compliance, which I believe causes problems any time you manually mess with the stack or ebx. Any time you use either of these you have to return it to its original state before you reach the end of your block of assembly. In either case, Linux assembly does not work without some tweaking.

edit: I really should install linux on this thing so that I can compare things.
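
Since hand-placed stack writes behave so differently between the two toolchains, one portable alternative is to let the compiler build the call itself. This is only a sketch and not what the thread's code does: `generic_fn` and `call_with_ints` are hypothetical names, and it assumes a small, fixed maximum number of int-sized parameters. `multi` is the same callee as in the test program earlier in the thread.

Code:
```c
#include <stddef.h>

/* Generic function-pointer type used only for storage; it must be cast back
 * to the callee's real signature before the call is made. */
typedef void (*generic_fn)(void);

/* Hypothetical dispatcher: call fn with up to three int arguments.
 * Returns 0 and stores the callee's return value in *result on success,
 * or -1 if the argument count is not supported. */
static int call_with_ints(generic_fn fn, const int *args, size_t nargs, int *result)
{
    switch (nargs) {
    case 0:
        *result = ((int (*)(void))fn)();
        return 0;
    case 1:
        *result = ((int (*)(int))fn)(args[0]);
        return 0;
    case 2:
        *result = ((int (*)(int, int))fn)(args[0], args[1]);
        return 0;
    case 3:
        *result = ((int (*)(int, int, int))fn)(args[0], args[1], args[2]);
        return 0;
    default:
        return -1;
    }
}

/* Same callee as the earlier test program. */
static int multi(int first, int second)
{
    return first * second;
}
```

Calling `call_with_ints((generic_fn)multi, args, 2, &rtn)` with args holding 9 and 5 goes through the compiler's normal calling convention, so there is no stack to tidy and nothing that depends on -fPIC. The cost is one case per supported parameter count, and the caller must pass exactly as many arguments as the callee really takes.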


Top
 Profile  
 
PostPosted: Tue Mar 10, 2009 10:28 pm 
Offline
coder

Joined: Wed Oct 01, 2008 2:55 pm
Posts: 103
Location: Michigan
I committed another change, which is mostly a workaround for a texture problem (it's ifdef'd so it only affects OS X). After this change I have not seen any crashes, which I believe means things are pretty much stable. It's been a long ride, but Homeworld SDL is pretty much stable on osx86. :mrgreen:


Top
 Profile  
 
PostPosted: Wed Mar 11, 2009 2:47 am 
Offline
coder

Joined: Tue Nov 07, 2006 4:40 am
Posts: 236
Cool, and congratulations on getting there!

So, you just need to play through once or twice to make sure it's stable! :D

Aunxx.


Top
 Profile  
 