SugarCRM CE Codebase Improvements Wishlist

Since SugarCRM CE has now been forked as SuiteCRM, given the uncertainly of its ongoing support from the company behind it, I thought it time to start putting my wishlist codebase improvements that will help make it a lot more maintainable and usable. These are simply things that I have found a royal pain to work with, and hopefully there will now be an opportunity to start putting these things right.

These things are holding SugarCRM CE back. Without addresses these issues, and more, the application will not be able to advance.

My list, in no particular order:

Remove PHP Closing Tags

Every PHP script ends with a ?> tag. Stop doing this! A pure PHP script with a closing ?> tag is just going to cause problems as white-space that follows it becomes part of the output. PSR-2 also says, don’t do this. They should all be removed. Just draw a line under it, remove them all in one go, and carry on without them.

Use an Autoloader

There are classes all over the place in SugarCRM. They are included in various ways, in various places. Finding a class is not easy, as there are many conventions used for locating and naming them. This could all be standardised into one autoloader. I would go straight for PSR-4, which is in its final voting phase as I write this.

This is not trivial, as there will need to be a bit of a shake-up of the classes in the project. However, it would simplify things in the long run.

With nearly 600 instances of include(), 100 of include_once(), 300 require() and nearly 4000 instances of require_once(), you begin to ask yourself just how sustainable this stuff is. Something has gone very wrong there.

Use Composer

Whether composer and packagist specifically are here to stay for the long term, the idea of shared code, and dependencies that are automatically managed, is here and a great idea. Laravel 4 did it – much of its inner core was ripped out and put into separate packages and libraries, meaning that installing an instance of Laravel now means pulling all these external packages together into one place, and the composer dependency management handles that.

The main point of this is to stop reinventing the wheel. There are well-used, well-understood and well-maintained packages for handling autoloading, logging, HTTP and REST APIs, validation, database ORMs, time/date handling — the list goes on. SugarCRM has tended to reinvent all of these. Through them all away and use the excellent alternatives that are available off-the-shelf.

Stop Writing PHP Scripts On-the-fly

SugarCRM can be evil at times. It writes PHP scripts everywhere. No directory is safe from scripts being written on-the-fly, compbined together, included from the cache folder – it is a mess.

As well as making the ability to secure files and directories incredibly near impossible, it also makes setting up code repositories very difficult, as you never really know which PHP scripts to store in the repository and which are going to be regenerated.

On top of this, many cloud hosting services, such as Google App Engine, do not allow PHP to be written and run by the application. If PHP needs to be generated, then it needs to be done offline and uploaded, and that can only be done if it is well-managed in one place rather than spread right across all the directories in the application.

I have to say this: having to set file and directory permissions of the entire SugarCRM application to 775 is simply a terrible idea. But that is the way it is designed to work. Once a hacker is in there, they can have a party with your data, scripts and users. It honestly scares the fuck out me when I have to issue the command to open up write access for apache for just about everything.

Better Organisation of the Cache Directory

It is called a cache directory, but it far from being a cache directory. For example, the image uploads directory is one, big, flat directory. Once you have users archiving thousands of their sales emails, complete with documents and company logos in, the cache/upload/ directory will be unmanageably large. It is crazy – it needs some structure.

On top of this, some cloud services again, do not let you simply write files to the local filesystem. The reason for this is that many services simply do not have a local, writeable filesystem – your code could be running on a server anywhere in the world, local to the user accessing it. So, all file access on SugarCRM needs to go through a single file handling library to translate the file requirements to whatever underlying technology the hosting provides. If SugarCRM is not going to be stuck on expensive dedicated servers, then this is important.

Old Style PHP Breaks on 5.4

Running SugarCRM under PHP5.4 is an easy way to fill up your apache error log. There are so many “old and wrong” ways of doing things that coders have got away with for years, that they are difficult to count. These are things that have always been wrong, but which PHP allows you to get away with. PHP 5.4 draws a line in the sand and raised E_STRICT errors on these. You can turn E_STRICT off (mostly – see entry points below) but that defeats the object. A PHP application should not be generating errors or warnings when it rules.

Examples include checking the values of variables and array elements before checking whether they are set in the first place; extending classes and changing the parameter types of its methods, and general use of many deprecated functions.

Too Many Entry Points

There are many entry points in SugarCRM. Each one sets up the environment and globals independently, and usually in a slightly different way to all the other entry points. If, for example, you want to switch off E_STRICT error checking, you need to locate all the places that is turned on in each entry point and hack the code at those points. There will always be additional places where your settings get overridden.

It makes managing SugarCRM very hard. The argument is that each entry point has different requirements and so will need different libraries loaded up and you don’t want to load too much redundant code in some of the lighter AJAX functions. I don’t buy that. The initial stage where routing can be handled does not need to load up anything until it knows what is handling a request.

Input is Read in Too Many Places

A quick scan of an install gave me over 6000 instances of $_REQUEST being used. That is plain ridiculous. Put ALL input through one handler, get it validated through an input validation package, and do it properly. When I see stuff like this in the code, I shudder with fright – it truly scares me how many vulnerabilities there are to be discovered:

$query .= "WHERE document_revisions.id = '" . $db->quote($_REQUEST['id']) ."'";
$download_location = $GLOBALS['sugar_config']['upload_dir']."/".$_REQUEST['id'];
echo $email->getNamePlusEmailAddressesForCompose($_REQUEST['action_module'], (explode(",", $_REQUEST['uid'])));

There are thousands and thousands of these.

Dealing with Globals

The SugarCRM code is littered with references to $GLOBAL. For example, the logger is used like this:

$GLOBALS['log']->debug('Debugging the thing: ' . $item);

That locks the system into  the use of the $GLOBALS. It also locks everything into assuming a global variable called $log will be set up with the logging class. The way globals are accessed in PHP has changed subtly over the years. Many scripts from ten years ago simply won’t work today because globals that were expected to be around have been remo0ved.

Laravel offers an alternative. Instead of a global, an alias is set up for a provider class that delivers the debugger. The above could look something like this:

Log::debug('Debugging the thing: ' . $item);

Other logging classes can easily be switched in through the provider. By going through a provider, it also means the globally-used classes can be instantiated only when first used. Those classes that are not used, do not take up any time or memory resources.

On top of this, instead of logging strings that have to be constructed in the code doing the logging, use place-holders:

Log::debug('Debugging the thing: #1', $item);

There are many ways placeholders can be implemented, so keep an eye on what others are doing, especially the PSR efforts to standardise logging interfaces. This will also allow the opportunity to parse the parameters – serialise objects, pretty-format SQL and JSON, expand unix format timestamps into more understandable dates, and so on. Also placeholder strings are the only way it is going to be possible to provide language translations of the log text.

I found 5900+ references to $GLOBALS in the 6.2 codebase. Globals are not limited to “log” – that is just one example. We also have dictionary, em_package_list, sugar_config, sugar_version, db, beanFiles – it is a massive list that goes on.

Not All Dates Are Localised

This one goes right down to the architecture of the system. A handy feature is that all timestamps are localised. If you store a time and date into a timestamp field, then it will be converted from your time zone into GMT, and stored in the database. Anyone looking at the date in any page, will find that the GMT time is extracted from the database, then converted to the time that would represent in their time zone.

Now that is handy, for situations such as booking telephone meetings, until you don’t want it to work like that. Unfortunately the time zone conversion is done at such a low level in the system, that it is almost impossible to override or disable for a given field.

Examples of when you may not want this to happen, is when storing flight details, which are always shown in the departure or arrival locales, regardless of who is looking at it. Other dates and times such as physical meeting times, are useful to be shown in local time too, but then it would be useful being able to specify what that local time zone is. If I need to be on a plane for a meeting at 10am, I need to know that is 10am at the destination, and not 10am when converted to my current time zone, otherwise I may miss that meeting.

My beef here is that the time zone conversion is done in the wrong place. The database layer needs to deliver the recorded time, and the time zone it refers to, and let the field decide how to handle that.

Other Stuff

I’ll add further things as I come across them. SugarCRM CE and SuiteCRM have great promise, but need a real shake-up to remove the software rot and dead wood they have built up over the years. With SuiteCRM, I believe there is a good chance of this happening.

What are your pet-hates with the SugarCRM code base? What do you find makes the simplest of changes turn into major headaches that really should not be so?

3 Responses to SugarCRM CE Codebase Improvements Wishlist

  1. Raul 2013-12-12 at 19:52 #

    This wish list is great, It’ll make my life easier if suitecrm were at least php 5.4 compatible, we’re at php5.5 right now,
    I’m deploying on 5.4 and looking forward 5.5, having a legacy app running php <5.4 is a no go.

    I see great potential in oro-crm, but it's a huge application, and it's on beta yet.
    Though, I was never a fan of the eav model.

    In the end, I believe we have two great candidates. And I'm about to lean on suitecrm for my next project.
    We'll see.

    • Jason Judge 2013-12-12 at 20:00 #

      I’m on 5.4 with SugarCRM 6.x, and speed the error logs fill up is frightening (we are talking Gigibytes over a few months). We don’t have to make use of any special PHP5.4/5.5-only features just to make it work without breaking the rules.

      I noticed too that the SugarCRM 7.x line is being led by a new dev that has been evangelising the use of composer for package dependency management. Hopefully that will mean a more modular system based on more open source libraries (which it may or may not be), but it would certainly shake up the codebase and break a lot of things (which will unfortunately be necessary to improve it in the long run).

      Here is the post announced by Jon Mertic, who is “transitioning” (aka moving on):

      https://groups.google.com/forum/#!topic/php-fig/GrLYJaH1ldo

      I think it is going to be very important to watch SugarCRM *very* closely, so if there is a divergence between SugarCRM and SuiteCRM, it is one where SuiteCRM is not left behind doing things the old way. If that happened, then I think the fork would be complete and the two would not longer track each other. That’s not a necessarily bad thing, but people do need to be aware it is a possibility and what the consequences are.

    • Jason Judge 2013-12-14 at 00:09 #

      I think the EAV model is nice from a theoretical view, but in the real world, it takes the data structures just one step too far away from what we can grasp and see and manipulate by hand. ORO-CRM looks nice, and I will explore that a bit. I guess it turns into a Magento – really flexible, but needs loads of resources to run, and you cannot look at or manipulate the data without writing a program to do so. I do like old-fashioned relational tables with lots of columns to hold the data. I have used them many times on SugarCRM to directly fix data issues with simple tools

      However, an API-based black-box for holding data and relations in a flexible way, may just be the key to the future of CRMs. Just so long as it is reliable and tools for manually fixing data are good.

      Best of luck with your project.

Leave a Reply