SourceCode

From RackTables Wiki
Jump to navigation Jump to search

the source code repository

RackTables project uses git as version control system for the source code. The git repository that is used to make "tar.gz" releases is currently hosted on GitHub. It includes the following notable branches:

master
The main development branch, that is, the default branch for all new changes.
maintenance-0.16.x
maintenance-0.17.x
maintenance-0.18.x
maintenance-0.19.x
Archived maintenance branches, which don't accept any new changes.
maintenance-0.20.x
Current maintenance branche, that is, the stable branch that serves as the base for normal RackTables releases.

Use the following commands to get a working copy of the repository:

# from the RackTables project repository
git clone git://github.com/RackTables/racktables.git
# or
git clone https://github.com/RackTables/racktables.git
# or
git clone ssh://git@github.com:RackTables/racktables.git

# from a GitHub fork (of your own or not)
git clone git://github.com/username/racktables.git
# or
git clone https://github.com/username/racktables.git
# or
git clone ssh://git@github.com:username/racktables.git

Note that the first two schemas (git:// and https://) are always read-only and the third (ssh://) may be read-only or read-write.

Use the following commands to examine local unsaved changes:

git status
git diff

The recommended way of contributing the work is to:

  1. fork RackTables on GitHub,
  2. push finished commits to the fork, and
  3. create a pull request.

In the case of a private repository use a git format-patch command to produce a patch and send it to the developers. When sending a patch, please make sure your branch is rebased onto the latest respective (master or maintenance) branch.

PHP code style guide

Almost all PHP source code in RackTables is original and the following style guide applies:

use Allman indent style.

Why: Allman style is very diff-friendly when it comes to control structures. In Allman style expansion of a single inner operator into a larger code block (and vice versa) leaves both the inner (controlled) and the outer (controlling) operator intact:

 for ($i = 0; $i < 10; $i++)
+{
 	doSomething ($i);
+	doSomethingElse();
+}

This works seamlessly at any indentation level and for all control structures (if/for/foreach/while), however deeply nested in each other. With careful indentation of case within switch it will work same well:

switch ($var)
{
	case 1:
		doSomething();
		break;
	case 2:
-	{
-		doSomethingElse();
		return 3;
-	}
	case 3:
+	{
+		doSomethingCompletelyDifferent();
		return 4;
+	}
}

Being diff-friendly means being friendly to a version control system (Subversion in the past, git at present). Neither the VCS nor the Allman style alone guarantee one's commits to be focused and clean. However, Allman style encourages commits that change only the lines that really need to be changed.

Exception: inline (PHP heredoc) HTML, SQL and JS code.

use exactly one tab per indentation level

Why: different people prefer different (2, 4, 8...) amount of "spaces" per indentation level. Some projects use space indentation and force everyone to hit Space (Backspace) N times each time the developer needs to indent (unindent), regardless of their own preferred value of N. RackTables makes use of Tab and enables every developer to configure their editor for their preferred "tab width" (the screenshot below stands for tab width 2). Note that the tab width must stand for the amount of spaces the editor displays, but not generates instead of, that is, in the source file tabs must remain tabs.

Exception: inline (PHP heredoc) HTML, SQL and JS code.

Interface tab2.png

wrap lines longer than 100-120 characters

Why: readability. Prior experience tells that old-school 80-columns margin just doesn't work with descriptive naming of variables and functions (i.e. descriptive == heavily wrapped). On the other hand, avoid making a readable, 200-characters long line even if it fits your monitor. Use good judgement.

consider using indentation/alignment for function arguments

Why: the purpose is twofold. On one hand it allows to pack a lot of arguments and expressions into a single function call without losing track of what is being done and no temporary variables:

		usePreparedUpdateBlade
		(
			'Port',
			array
			(
				'name' => $port_name,
				'type' => $port_type_id,
				'label' => $port_label,
				'reservation_comment' => $reservation_comment,
				'l2address' => nullEmptyStr ($db_l2address),
			),
			array
			(
				'id' => $port_id,
				'object_id' => $object_id
			)
		);

On the other hand, it allows to group function arguments add/or add comments to them:

function get8021QSyncOptions
(
	$vswitch,
	$D, // desired config
	$C, // cached config
	$R  // running-config
)
{
	$default_port = array
	(
		'mode' => 'access',
		'allowed' => array (VLAN_DFL_ID),
		'native' => VLAN_DFL_ID,
	);
	$ret = array();

Use good judgement.

consider using indentation/alignment for arrays

Why: due to a few reasons. First, multiline array assignment/initialization can accommodate a notable amount of supplementary expressions:

$ret[$pn] = array
(
	'status' => 'martian_conflict',
	'left' => array_key_exists ($pn, $C) ? $C[$pn] : array ('mode' => 'none'),
	'right' => array_key_exists ($pn, $R) ? $R[$pn] : array ('mode' => 'none'),
);

Second, updates to arrays that serve as lists produce cleaner diffs, for example:

--- a/wwwroot/inc/remote.php
+++ b/wwwroot/inc/remote.php
@@ -64,6 +64,7 @@ $breedfunc = array
 	'jun10-get8021q-main'      => 'jun10Read8021QConfig',
 	'jun10-xlatepushq-main'    => 'jun10TranslatePushQueue',
 	'jun10-getallconf-main'    => 'jun10SpotConfigText',
+	'jun10-getlldpstatus-main' => 'jun10ReadLLDPStatus',
 	'ftos8-xlatepushq-main'    => 'ftos8TranslatePushQueue',
 	'ftos8-getlldpstatus-main' => 'ftos8ReadLLDPStatus',
 	'ftos8-getmaclist-main'    => 'ftos8ReadMacList',

Finally, it makes tree-like array initialization more transparent:

$SQLSchema = array
(
	'object' => array
	(
		'table' => 'RackObject',
		'columns' => array
		(
			'id' => 'id',
			'name' => 'name',
			'label' => 'label',
			'asset_no' => 'asset_no',
			'objtype_id' => 'objtype_id',
			'rack_id' => '(SELECT MIN(rack_id) FROM RackSpace WHERE object_id = RackObject.id)',
			'rack_id_2' => "(SELECT MIN(parent_entity_id) FROM EntityLink WHERE child_entity_type='object' AND child_entity_id = RackObject.id AND parent_entity_type = 'rack')",
			'container_id' => "(SELECT MIN(parent_entity_id) FROM EntityLink WHERE child_entity_type='object' AND child_entity_id = RackObject.id AND parent_entity_type = 'object')",
			'container_name' => '(SELECT name FROM RackObject WHERE id = container_id)',
			'container_objtype_id' => '(SELECT objtype_id FROM RackObject WHERE id = container_id)',
			'has_problems' => 'has_problems',
			'comment' => 'comment',
			'nports' => '(SELECT COUNT(*) FROM Port WHERE object_id = RackObject.id)',
			'8021q_domain_id' => '(SELECT domain_id FROM VLANSwitch WHERE object_id = id LIMIT 1)',
			'8021q_template_id' => '(SELECT template_id FROM VLANSwitch WHERE object_id = id LIMIT 1)',
		),
		'keycolumn' => 'id',
		'ordcolumns' => array ('RackObject.name'),
	),

put else near its if

When an if operator includes the else part, consider arranging the branches so that the short branch belongs to the if and the long branch belongs to the else. Why: it helps following the code.

Holub, Allen (1995). Enough Rope to Shoot Yourself in the Foot. McGraw-Hill. ISBN 978-0-07-029689-3 "Section 57: Put the shortest clause of an if/else on top."

make conscise use of the conditional ternary operator

Why: to use the language element in making the code look closer to what the code is intended to do. First, don't reinvent the conditional ternary operator with if like this:

  if ($row !== FALSE)
    return $row[0];
  else
    return NULL;

That block of code returns a value in either case, exactly one return would do the job:

  return $row === FALSE ? NULL : $row[0];

Second, don't nest ternary conditionals, it is hard to read:

  $ret = ($ret > 0 ? 1 : ($ret < 0 ? -1 : 0));

For that specific case use numSign().

Finally, when a standalone ternary conditional isn't sufficient or is long/difficult to read, use temporary variables or necessary amount of if's.

things to consider when adding a new function

  • First, is it possible to use one of the existing functions or a combination of these?
  • Carefully decide which file the function should be in (depends on its purpose and relation to other functions).
  • Consider headlessCamelCase for the function name.
  • Try to keep functions focused, that is, small, doing just one thing but doing it very well.
  • When adding a new function, add unit test(s) for it as well.

put $self = __FUNCTION__; at the beginning of a recursive function

Why: to make it easier to see a recursive function. Such a function should call itself as $self(...) so that if the function is renamed later its code requires no changes to work.

# Return a list consisting of tag ID of the given tree node and IDs of all
# nodes it contains.
function getTagIDListForNode ($treenode)
{
	$self = __FUNCTION__;
	$ret = array ($treenode['id']);
	foreach ($treenode['kids'] as $item)
		$ret = array_merge ($ret, $self ($item));
	return $ret;
}

prefer // over # for single-line comments

Why: it saves keystokes (double-press one key versus pressing two different keys simultaneously). However, when adding a comment to a code that already uses #, be consistent with what is already there. Also, the code contains ~ 2000 of //-style lines versus ~450 of #-style.

don't use foreach() by reference

Why: this valid language construct has a side effect that complicates things. Consider the following code:

$a = array (1 => 'a', 2 => 'b', 3 => 'c');
foreach ($a as &$tmp)
  $tmp = $tmp . '_str';

So far so good, but once you decide to reuse $tmp and assign a new value to it, you will [unexpectedly] modify the array ($tmp still holds the reference to the last element). This thread discusses the workings of this at length. The workaround would be to use unset ($tmp) after the foreach-by-reference, but the least-surprising solution is to iterate over array keys if you need to modify array elements and array_walk() is not sufficient:

foreach (array_keys ($a) as $k)
  $a[$k] = $a[$k] . '_str';

use double-quotes for strings only when necessary

Why: it saves the time required to tell "just strings" from strings that embed expansions. Consider the following:

$string1 = 'Single-quotes give a hint that this string is just a string.';
$string2 = "Double-quotes give a hint that this string not just _may_ contain " .
  "expansions but it actually does: here are $var1 and ${var2}";

use array_key_exists() instead of isset() to test a key in an array

Why: with isset() it is impossible to distinguish a missing element from a typo in the array name:

$arr = array (1 => 'a', 2 => 'b');
if (array_key_exists (10, $arr)) # FALSE
  echo 'OK';
if (array_key_exists (10, $ar)) # FALSE and a PHP warning (note the typo)
  echo 'OK';
if (isset ($arr[10])) # FALSE
  echo 'OK';
if (isset ($ar[10])) # FALSE and no warning
  echo 'OK';

don't use language features of PHP 5.3+

Why: because some users still use PHP 5.2. This includes closures (lambda functions) and a number of functions introduced in PHP 5.3.

use "&&" and "||", not "and" and "or"

Why: because (unlike, for instance, C++) those are not just alternative notations but quite different operators. From the PHP operator precedence table it is possible to see that, for example, the following two lines of code work in a very different way:

$c = $a or $b; // Evaluate $a and assign the value to $c. If $c was set to false, evaluate $b and discard the value.
$c = $a || $b; // Evaluate $a. If the value is true, set $c to true. Otherwise evaluate $b and assign the value to $c.

Not everybody remembers about this difference and eventually interprets the code not as PHP actually executes it. As the gain of "and" and "or" in RackTables is negligible, the simplest solution is not to use them at all.

test for an empty string in the simplest way possible

Just compare with an empty string. Only use strlen() and mb_strlen() when those are really required, i.e. when it is necessary to know the exact length of a string or a multibyte string.

if ($s == '') // OK
  ...
if ($s != '') // OK
  ...
if (strlen ($s)) // overkill
  ...
if (! strlen ($s)) // overkill
  ...
if (mb_strlen ($s)) // overkill
  ...
if (! mb_strlen ($s)) // overkill
  ...

SQL code style guide

use uppercase for SQL keywords and wrap long lines

Why: readability

	$result = usePreparedSelectBlade
	(
		'SELECT FileLink.file_id, FileLink.id AS link_id, name, type, size, ctime, mtime, atime, comment ' .
		'FROM FileLink LEFT JOIN File ON FileLink.file_id = File.id ' .
		'WHERE FileLink.entity_type = ? AND FileLink.entity_id = ? ORDER BY name',
		array ($entity_type, $entity_id)
	);

use SELECT with an explicit list of columns, don't use SELECT *

Why: this is very well explained in many sources, for instance, here and here.