I think the whole MT issue can be discussed from two angles:
1) From TECHNICAL point of view.
2) From DEVELOPER (End-Users) point of view.
As I am not thoroughly equipped on the technical side, I present below the response from Przemek (who implemented MT in Harbour), which he posted on Harbour’s Dev-List.
1. TECHNICAL PART
Xbase++ Steffen:
Sorry to use your blog to comment on the comments of others, as I think this is not the idea of a blog. Anyway, please allow me to clarify my statement. I said "clean and easy to use way of multithreading"; I didn’t say the Harbours don’t support multithreading.
My statement is still true, even though Harbour and xHarbour have implemented the ability to execute code in multiple threads and have implemented some of the interfaces Xbase++ provides in one way or another. They are still far away from the idea and concepts of Xbase++ in that area.
In addition, Harbour and xHarbour implemented different semantics of isolation, which for sure makes porting complex MT applications a mess. Let me clarify that.
Przemek:
It’s not true.
First, Harbour and xHarbour use different MT implementations and they should not be confused.
In Harbour it’s possible to use threads like in xbase++. It supports thread and signal classes/objects, the concept of a zero zone for open workareas (dbRequest()/dbRelease()), SYNC object methods, isolated SET and RDD settings, and copying or sharing memvars with parent threads. xbase++ thread-related functions like ThreadID(), ThreadObject(), ThreadWait() and ThreadWaitAll() are also supported.
It’s highly possible that I haven’t replicated xbase++ behavior exactly (e.g. the implementation of the THREAD class should still be extended to add support for thread restarting when a thread interval is set), but it’s rather simple .prg code and I hope that xbase++ users who are interested in exact emulation will do it. I’m not an xbase++ user so I cannot easily test the details of the implementation.
Harbour does not support thread priority, but it’s not a multiplatform and portable feature so it cannot be well implemented. Anyhow, it can be added in a few lines for those platforms which are supported by xbase++.
But in Harbour you can also use other things which do not exist in xbase++. Very important is also scalability, which is far better than in xbase++ or in xHarbour, so if you have a multi-CPU machine you can expect that the speed of an MT application will be noticeably improved.
The mutexes in Harbour give a very flexible synchronization mechanism. They can be used as message queues, condition variables or simple mutexes. PRIVATE and PUBLIC sharing or copying is optional and controlled by the user. It’s possible to allocate many console windows in a single thread or in many threads. Console windows can be shared between threads or can be dedicated to a single thread only.
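The "mutexes as message queues" remark can be made concrete with a small sketch. It is written in Python rather than Harbour and does not use Harbour’s API; `NotifyingMutex`, `notify` and `subscribe` are hypothetical names chosen for this illustration. The only assumption is that one object combines a lock with a queue of pending values on which a consumer can wait with a timeout.

```python
import threading
from collections import deque


class NotifyingMutex:
    """Hypothetical stand-in for a mutex that can also carry messages."""

    def __init__(self):
        self._cond = threading.Condition()  # lock plus wait/notify in one object
        self._items = deque()

    def notify(self, value):
        # Producer side: enqueue a value and wake one waiting subscriber.
        with self._cond:
            self._items.append(value)
            self._cond.notify()

    def subscribe(self, timeout=None):
        # Consumer side: block until a value arrives or the timeout expires.
        # Returns (True, value) on success and (False, None) on timeout.
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout):
                return False, None
            return True, self._items.popleft()


def run_demo(n_jobs=5):
    queue = NotifyingMutex()
    results = []

    def worker():
        while True:
            ok, job = queue.subscribe(timeout=2.0)
            if not ok or job is None:  # None is the stop marker in this sketch
                break
            results.append(job * job)

    consumer = threading.Thread(target=worker)
    consumer.start()
    for i in range(n_jobs):
        queue.notify(i)
    queue.notify(None)
    consumer.join()
    return results
```

With a single consumer, `run_demo()` returns the squares in the order the jobs were posted. The same object still works as a plain lock if nobody ever posts a value, which is the flexibility the paragraph above describes.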
Harbour supports THREAD STATIC variables and they are used in core code. It means that Clipper code which needs static variables, like the getlist implementation, is MT safe in Harbour. It also gives a very powerful mechanism to MT programmers. There are also many other things related to MT programming which seem to be unique to Harbour and do not exist in xbase++.
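As an illustration of what a thread-static variable buys, here is a minimal sketch in Python, not Harbour. It assumes that THREAD STATIC behaves like per-thread storage and uses Python’s `threading.local` as a stand-in; `_state` and `add_get` are hypothetical names, loosely modelled on a getlist that each thread builds for itself.

```python
import threading

# Each thread sees its own independent copy of attributes stored here,
# which is the behaviour assumed for a thread-static variable.
_state = threading.local()


def add_get(name):
    # Lazily create the per-thread list, as a static initialiser would.
    if not hasattr(_state, "getlist"):
        _state.getlist = []
    _state.getlist.append(name)
    return list(_state.getlist)


def run_demo():
    seen = {}

    def worker(tag):
        snapshot = []
        for i in range(3):
            snapshot = add_get(f"{tag}{i}")
        seen[tag] = snapshot  # each thread writes only its own key

    threads = [threading.Thread(target=worker, args=(tag,)) for tag in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Neither thread ever sees the other’s entries, although both call the same function and no lock is involved. With an ordinary static (module-level) list the two threads would append to one shared list instead.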
In summary, Harbour offers an xbase++ compatible MT API, but rather as an optional feature for programmers, because it provides its own more powerful and flexible one and the final applications scale much better.
Xbase++ Steffen:
Multithreading, as the ability to execute code in different code paths, has been a feature of modern OSes for decades.
The problem with MT is that it adds another dimension of complexity to the developer’s task. While with single-threaded apps the developer needs only to think in a more or less sequential way, with MT each execution path adds a new dimension to the equation of program complexity.
Development languages such as Delphi, .NET (C#, VB) or Harbour and xHarbour support MT, that’s correct, but they do not remove the burden of correctness from the programmer. It is the sole responsibility of the programmer to ensure program correctness in two different areas: data consistency and algorithm isolation.
Przemek:
I agree.
Xbase++ Steffen:
The problem of data consistency occurs as soon as more than one thread is accessing the same data - such as a simple string or an array.
Besides nuances in terms of single or multiple readers/writers, the consistency of the data must be ensured, so developers are forced to use mutex-semaphores or other higher-level concepts such as monitors, guards... to ensure data consistency.
Przemek:
Yes, usually they are, though different languages give some additional protection mechanisms here, so it is not always necessary to use user-level synchronization.
Xbase++ Steffen:
Algorithm isolation is somewhat related to data consistency. It is obvious that a linked list accessed from multiple threads must be protected, otherwise dangling pointers occur. But what about a table/relation of a database?
The problem here is that concurrency inside the process can be resolved - but this type of "isolation" does break the semantics of the isolation principles which are already provided by the underlying DBMS (SQL isolation levels, record or file locks, transactions). Therefore algorithm isolation/correctness is a completely different beast, as it is located at a very high semantic level of the task.
Przemek:
Yes, it is.
Xbase++ Steffen:
Alaska Software has put an enormous amount of research efforts into that area and we have more than a decade of practical experience with that area based on real world customers and real world applications.
From that point of view I would like to reiterate my initial statement "As of today there is still no tool available in the market which provides that clean and easy to use way of multithreading".
Przemek:
I did not make such an "enormous amount of research efforts" ;-) I simply looked for a good balance between performance, basic protection and flexibility for programmers.
Xbase++ Steffen:
Let’s start with xHarbour. Its MT implementation is not well thought out, as it provides MT features to the programmer without any model, just the features. xHarbour even allows the usage of a workarea from different threads, which is a violation of fundamental DBMS isolation principles.
In fact, xHarbour is just a system language in the sense of MT and does not really make life easier compared with other system languages. Therefore there is no value in it besides being able to do MT. Also keep in mind that, due to the historical burden of the VM and RT core, the MT feature is implemented in a way that makes it impossible to scale in future multi-core scenarios (see later note).
Przemek:
I agree. Giving unprotected access to workareas is asking for trouble. It can create very serious problems (e.g. data corruption in tables) and gives nothing to programmers, because they have to use their own protection mechanisms to access the tables, so the final application is reduced to the same level as using dbRequest()/dbRelease() to lock/unlock the table. The only difference is that in such a model the programmer has to implement everything himself.
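The lock/unlock view of dbRequest()/dbRelease() can be sketched as an ownership handoff. The example is Python, not Harbour, and `ZeroZone`, `release` and `request` are hypothetical names: a thread that releases a work area parks it in a shared holder, and whichever thread requests it next becomes its only owner until it releases it again. This is an assumption about the shape of the model, not a description of either product’s internals.

```python
import threading


class ZeroZone:
    """Hypothetical holder for work areas that no thread currently owns."""

    def __init__(self):
        self._cond = threading.Condition()
        self._parked = {}  # alias -> work area object

    def release(self, alias, workarea):
        # The caller gives up ownership; it must not touch `workarea` afterwards.
        with self._cond:
            self._parked[alias] = workarea
            self._cond.notify_all()

    def request(self, alias, timeout=None):
        # Take exclusive ownership, waiting until some thread releases the alias.
        # Returns None if the alias does not become available in time.
        with self._cond:
            if not self._cond.wait_for(lambda: alias in self._parked, timeout):
                return None
            return self._parked.pop(alias)


def run_demo(n_threads=4, n_rounds=250):
    zone = ZeroZone()
    zone.release("orders", {"recno": 0})  # plain dict standing in for a table

    def worker():
        for _ in range(n_rounds):
            wa = zone.request("orders")
            wa["recno"] += 1  # safe: only the current owner can reach `wa`
            zone.release("orders", wa)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return zone.request("orders", timeout=1.0)["recno"]
```

Because the work area is reachable by at most one thread at a time, the update inside the loop needs no further locking. With free access to the same work area from every thread, each program would have to rebuild this kind of handoff itself.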
Xbase++ Steffen:
Harbour is better here because it follows the principles of Xbase++ more closely. While I am not sure whether the Harbour people decided to adopt the Xbase++ model for compatibility reasons or not, I am glad to see that they followed our model’s point of view.
The issue with Harbour, however, is that it suffers from the shortcomings of its runtime in general, the VM design and of course the way datatypes - the blood of a language - are handled. It is still in a 1980 architectural style centered around the original concept of how Clipper did it.
This is also true for xHarbour, so both suffer from the fact that MT was added, I think, in 2007, while the VM and RT core is from 1999 - without having MT in mind.
Przemek:
Here I can agree only partially.
First, Harbour does not follow the xbase++ model. With the exception of the xbase++ emulation level (xbase++ sync and thread classes, thread functions and sync methods), the whole code is the result of my own ideas.
The only idea I partially borrowed is the dbRequest()/dbRelease() semantics. Personally I wanted to introduce many workarea holders (not only a single zero area zone) and dbDetach()/dbAttach() functions.
Later I heard about the xbase++ implementation and I found the cargo codeblock attaching to be a very nice feature, so I implemented it, but internally it operates on workarea sets from my original idea, and it’s still possible to introduce support for multiple WA zones if we decide to add a .prg level API for it. In some cases it may be useful. Also the internal WA isolation in native RDDs is different. For POSIX systems it’s necessary to introduce file handle sharing, and this mechanism is already used, so now we can easily extend it by adding support for a pseudo-exclusive mode (other threads will be able to access tables opened in exclusive mode, which is exclusive only for external programs) or by adding caches common to aliased WAs. Of course Harbour also supports other xbase++ extensions, but they were added rather for compatibility, for xbase++ users, and internally they use the basic Harbour MT API.
Second, this old API from 1980 is a real problem in some places and it will probably be good to change it. But I also do not see the xbase++ API as the only final solution. Harbour gives full protection for read access to complex items. The user has to protect only write access, and only if he wants to change exactly the same item, not a member of a complex item. For example, this code:
aVal[ threadID() ] += aVal[ threadID() ] * 2 + 100
is MT safe in Harbour even if the same aVal is used by many different threads.
What matters is that each thread operates on different aVal items and aVal is not resized. Otherwise it may cause data corruption. But when complex items can be resized, we usually need additional protection in xbase++ as well, because user code performs many operations which have to be atomic in some logical sense. So in most cases there is only one difference here between Harbour and xbase++: in xbase++, with full internal protection and missing user protection, an RT error is generated.
In Harbour it may cause internal data corruption. I agree that this is a very important difference, but in most such cases we are talking about wrong user code which needs additional user protection in both languages. And here we have one fundamental question:
What is the cost of internal protection for scalability?
and whether we can or cannot accept it. My personal feeling is that the cost will be high, even very high, but I haven’t made any tests myself, though some xbase++ users confirmed that it’s a problem in xbase++. I’m really interested in some scalability tests of xbase++ and Harbour. They could give a few very important answers. If some xbase++ user can port tests/speedtst.prg to xbase++, it will be very helpful.
Of course it’s possible that I missed something here but I’ve never used xbase++ and I cannot see its source code so I only guess how some things are implemented in this language.
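The access pattern behind the aVal example can be sketched as follows, in Python rather than Harbour; `run_demo`, `total` and `total_lock` are hypothetical names. Each thread writes only its own slot of a preallocated, never-resized array without any lock, while the one value that every thread updates is guarded by an explicit lock. Python’s interpreter lock means this does not demonstrate Harbour’s guarantee; it only shows which accesses the argument treats as needing user protection.

```python
import threading


def run_demo(n_threads=4, n_rounds=3):
    # One preallocated slot per thread; the list is never resized afterwards.
    a_val = [0] * n_threads
    total = [0]                    # a single item that every thread updates
    total_lock = threading.Lock()  # user-level protection for that item

    def worker(slot):
        for _ in range(n_rounds):
            # Private slot: no other thread reads or writes a_val[slot].
            a_val[slot] += a_val[slot] * 2 + 100
        # Shared item: the read-modify-write must be guarded by the caller.
        with total_lock:
            total[0] += a_val[slot]

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a_val, total[0]
```

Each slot goes 0, 100, 400, 1300 over three rounds, so every thread ends with the same value regardless of scheduling. The open question raised above is what it costs when a runtime adds the equivalent of `total_lock` around every item access, including the private slots that do not need it.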
Xbase++ Steffen:
This is in fact one of the biggest differences between Xbase++ and the "Harbours" from a purely architectural point of view: we designed a runtime architecture from the beginning to be MT/MP and distributed, they designed a runtime based on the DOS Clipper blueprint.
In fact, I could argue on and on, specifically when it comes to dedicated implementations of the Harbour runtime core or the Harbour VM, but sharing this type of technical detail is of course definitely not what I am paid for -;) Anyway, allow me to make it clear in general terms.
Przemek:
See above. It’s not as clear as you said.
I think that you will find users who can say that the cost of scalability is definitely not what they are paid for. Especially when missing user protection is also a problem for xbase++ and only the bad results are different.
For sure an RT error is much better than internal data corruption, but how much can users pay for such functionality?
Xbase++ Steffen:
First, any feature/functionality of Xbase++ is reentrant; there is not a single exception to this rule.
Second, any datatype and storage type is thread-safe regardless of its complexity, so there is no way to crash an Xbase++ process using multithreading.
Third, the runtime guarantees that there is no possibility of a deadlock in terms of its internal state regardless of what you are doing in different threads. There is a clean isolation and inheritance relationship of settings between different threads.
In practical terms that means you can output to the console from different threads without any additional code, you can execute methods or access state of GUI (XbasePARTS) objects from different threads, you can create a codeblock which detaches a local variable and pass it to another thread, you can be performing file I/O or executing a remote procedure call while in the meantime the async. garbage collector cleans up your memory - and the list goes on...
But in Xbase++ you can do all that without the need to think about MT or ask a question such as "Is the ASort() function thread safe?" or "Can I change the caption of a GUI control from another thread?". That’s all a given, no restrictions apply, the runtime does it all automatically for you.
Przemek:
Most of the above is also true in Harbour, with the exception of the missing GUI components and obligatory internal item storage protection. But that is the subject of efficiency discussed above.
Let’s make some scalability tests and then we can decide whether we want to pay the same cost as xbase++ users.
Xbase++ Steffen:
Anyway, I like Harbour more than xHarbour in terms of MT support.
However, the crux is still there: no real architecture around the product, leading to the fact that MT is supported from a technical point of view but not from a 4GL one, therefore leading to a potential unnecessary burden for the average programmer, and of course that was and still is not the idea of Clipper as a tool.
Przemek:
The only fundamental difference between Harbour and xbase++ in the above is obligatory internal item protection. At least that is what is visible to me now, and as I said, the cost of such functionality may not be acceptable for users. But let’s make some real tests to see how big a problem it creates in real life.
Xbase++ Steffen:
Btw, the same is true for VO or so; they left the idea of the language and moved to something more like a system language. While I can understand that somewhat, I strongly disagree with that type of language design for a simple reason: it’s not practical in the long term. We will see that in the following years as more and more multi-core systems find their way into the mainstream and developers need to make use of them for performance and scalability reasons.
In 10 - 15 years from now we will have 100 if not thousands of cores per die - handling multithreading and synchronisation issues by hand then becomes impossible; the same is true for offloading tasks for performance reasons.
So there is a need for a clean model in terms of the language - that’s at least what we believe at Alaska Software. It goes even further: the current attempts by MS in terms of multicore support with VS2010 or .NET 4.0 are IMO absolutely wrong, as they force the developer to write code depending on the underlying execution infrastructure, alias the cores available.
In other words, infrastructure-related code/algorithms get mixed with the original algorithm the developer writes and of course gets paid for. That’s a catastrophic path which for sure does not contribute to increased productivity and reliability of software solutions.
Przemek:
I agree with you only partially. Over some reasonable cost limit, MT programming stops being usable and it is much more efficient, safer and easier to use separate processes.
The cost of data exchange between them will simply be smaller than the cost of obligatory internal MT synchronization. So why use MT mode? For marketing reasons?
Xbase++ Steffen:
Funnily enough, the most critical and most difficult aspect in that area, getting performance gains from multi-core usage, is not even touched by my technical arguments right now.
However, it adds another dimension of complexity to the previous equation, as it needs to take into account the memory hierarchy, which must be handled by a 4GL runtime in a totally different way than with the simple approach of Harbour/xHarbour. Their RT core and VM need a more or less complete rewrite and redesign to go down that path.
Przemek:
I do not see bigger problems with Harbour core code modifications. If we decide that it’s worth it then I’ll implement it.
Probably the real problem will be forcing a different API on 3-rd party developers. Here we should probably choose something close to the xbase++ C API so as not to introduce additional problems for 3-rd party developers who have to create code for both projects, to have some basic compatibility, e.g. at the C preprocessor level.
Anyhow I’m still not sure I want to pay for the cost of full item access serialization.
Xbase++ Steffen:
In other words, Xbase++ has been playing in the multithreading ballpark for a decade.
Harbour is still finding its way into the MT ballpark, while xHarbour is in that context at a dead end.
I would bet that Xbase++ will be playing in the multicore ballpark while the Harbours are still busy with their MT stuff.
Przemek:
And it’s highly possible that it will happen. But Harbour is a free project and if we decide that adding full item protection at the cost of speed is a valuable feature then maybe we will implement it.
It’s also possible that we add such functionality as an alternative VM library. Just like now we have hbvm and hbvmmt, we will have hbvmpmt (protected mt).
Xbase++ Steffen:
In a more theoretical sense, it is important to understand that a programming language and its infrastructure shall not adopt every technical feature, requirement or hype, because then the language and infrastructure get more and more complicated, up to a point of lost control.
Also, backward compatibility and therefore protection of existing investments becomes more and more of a mess, with QA costs going through the roof.
Przemek:
_FULLY_AGREE_. Things should be as simple as possible. Any hacks or workarounds for single features create serious problems in the longer term and block further development. For me this was the main problem of xHarbour when I was working on that project.
Xbase++ Steffen:
Nor is it a good idea to provide software developers with every freedom - the point here is that a good MT model smoothly guides the developer over the hurdles and most of the time is not even in the awareness of the developer.
The contrary is providing the developer with all freedom, but this leads to letting him first make the gunpowder, then the gun, to finally shoot a bullet -;)
Przemek:
;-)
Xbase++ Steffen:
Therefore let me rephrase my initial statement to be more specific: as of today there is still no tool available in the market which provides that clean and easy to use way of multithreading. However, there are other tools which support MT - but they support it just as a technical feature without a model, and that’s simply wrong, as it leads to additional dimensions in code complexity - finally ending in applications with lesser reliability and overall quality. Just my point of view on that subject - enough said.
Przemek:
Thank you very much for this very interesting text. I hope that now the main internal difference between Harbour and xbase++ is clearly visible for users. To the above we should still add tests/speedtst.prg results to compare scalability, so we will know the real cost, which is an important part of the above description. I’m very interested in real-life results and I hope that some xbase++ users will port tests/speedtst.prg to xbase++ so we can compare the results.
Pritpal Bedi:
So those who were eager to understand the underlying concepts of MT and how it is woven into these products must be feeling at ease with the above discussion. Believe me, I also found it very rewarding.
2. DEVELOPER (END-USER) PART
I will turn to it some other time.
Regards
Pritpal Bedi