TPoint: a Delphi Type

Arrays in Delphi (structured data types)

An array is an ordered, indexed collection of elements of the same type that share a common name. The element type can be any type, including a structured one. Each element is uniquely identified by the array name and an index (the element's position in the array), or by several indices if the array is multidimensional. To access an individual element, write the array name followed by the index or indices in square brackets, for example arr1[3, 35], arr1[3][35] or arr3[7].

The number of index positions is determined by the dimensionality of the array (one-dimensional, two-dimensional, and so on), and the number of dimensions is not limited. In mathematics, the counterpart of a one-dimensional array is a vector, and that of a two-dimensional array is a matrix. Array indices must be of an ordinal type. Different indices of the same array may have different types. Most often the index is an integer.

Arrays are either static or dynamic.

Static arrays

A static array is an array whose index bounds, and therefore its size, are specified in the declaration, that is, they are known at compile time. The format of a static array type declaration:
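
The declaration format itself is not preserved in this text. As a minimal sketch, assuming a console program and the hypothetical names TVector, TTable, V, M and T, static arrays with fixed bounds and with indices of different ordinal types could be declared and accessed like this:

```pascal
program StaticArraySketch;
{$APPTYPE CONSOLE}

type
  // Index bounds are part of the type and are fixed at compile time.
  TVector = array[1..5] of Integer;
  // Each index position may use a different ordinal type.
  TTable = array[Boolean, 'a'..'c'] of Double;

var
  V: TVector;
  M: array[0..2, 0..3] of Integer;   // two-dimensional static array
  T: TTable;
  I: Integer;
begin
  for I := Low(V) to High(V) do
    V[I] := I * I;
  M[1, 2] := 42;                     // M[1, 2] and M[1][2] name the same element
  T[True, 'b'] := 0.5;
  Writeln(V[3], ' ', M[1][2], ' ', T[True, 'b']:0:1);
end.
```

Low and High return the declared bounds, so the loop does not need to repeat the literal limits 1 and 5.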

Keeper’s blog

== I’m starting with the man in the mirror ==

Friday, 29 July 2011

Using generics in Delphi! Part 1 (Introduction)

[Contents]
[Part 1: Introduction to generics] [Part 2: System classes] [Part 3: Application]

  1. What are generics and why are they needed?
  2. Advantages of using generics
    1. Type safety
    2. Efficiency
    3. Maximum code reuse
  3. Built-in generic classes in Delphi
  4. What can be made generic in Delphi?
    1. Generic methods
    2. Generic classes
    3. Generic records
  5. Conclusion

1. What are generics and why are they needed?
Generics let you define open types that become closed types at compile time. Listing 1 shows the syntax using the generic record TPoint<T> as an example:
Listing 1. Declaration of the generic record TPoint<T>
The differences from an ordinary record declaration are immediately visible: the record name carries <T>, and the coordinates X and Y are of that same type T. Here T is an unspecified type that is supplied later, when a concrete instance of the record is declared.
Suppose we want to use points with fractional coordinates (for example, Double) in an application. All we need to do is declare the following closed type:
Listing 2. Using the generic record TPoint<T> as a "fractional" point
If we need an integer type instead, we simply change Double to Integer:
Listing 3. Using the generic record TPoint<T> as an "integer" point
Simple, isn't it? MyPoint: TPoint<Double> and MyPoint: TPoint<Integer> are already closed types and obey all the rules that apply to ordinary, non-generic types.
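
Listings 1 to 3 are not preserved in this text. The following is only a sketch of what such declarations could look like; the variable names FloatPoint and IntPoint are assumptions, and System.Types is deliberately not used so that this TPoint<T> does not clash with the standard TPoint:

```pascal
program GenericPointSketch;
{$APPTYPE CONSOLE}

type
  // Open type: T stays unspecified until the record is instantiated.
  TPoint<T> = record
    X: T;
    Y: T;
  end;

var
  FloatPoint: TPoint<Double>;    // closed type with Double coordinates
  IntPoint: TPoint<Integer>;     // closed type with Integer coordinates
begin
  FloatPoint.X := 1.5;
  FloatPoint.Y := -2.25;
  IntPoint.X := 10;
  IntPoint.Y := 20;
  Writeln(FloatPoint.X:0:2, ' ', FloatPoint.Y:0:2);
  Writeln(IntPoint.X, ' ', IntPoint.Y);
end.
```

An assignment such as IntPoint.X := 1.5 would be rejected by the compiler, exactly as it would be for an ordinary record with Integer fields.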

A question may arise: can I do this without generics? Of course you can. You will, however, lose a number of advantages.

2. Advantages of using generics
2.1. Type safety
Generics help when you need stronger type safety and want to avoid type mismatch errors at run time. To demonstrate, let us compare the standard class TList with its generic counterpart TList<T>. As you know, TList stores an array of pointers to objects, and those objects may be of different types. Consider the following example:
Listing 4. Calling a method of class TCustomer for the elements of a TList
Now imagine that the TList passed in contains more than just TCustomer instances. For PrintCustomersInfo this is catastrophic and leads to an Invalid Type Cast error; in PrintCustomersInfo2 we avoided this with additional checks.

But wouldn't it be great to leave such checks to the compiler when the application is built? Generics make this possible:
Listing 5. Printing customer information through TList<TCustomer>
Did you notice that the code became shorter and more readable? In addition, the compiler has already made sure that nothing unwanted ends up in the TList<TCustomer>.
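
The original listings are missing, so the following is only a sketch of the typed variant. The TCustomer declaration, its PrintInfo method and the procedure name PrintCustomers are hypothetical; the unit name System.Generics.Collections assumes Delphi XE2 or later (older versions use Generics.Collections):

```pascal
program TypedListSketch;
{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  // Hypothetical class; the original TCustomer declaration is not shown.
  TCustomer = class
    Name: string;
    constructor Create(const AName: string);
    procedure PrintInfo;
  end;

constructor TCustomer.Create(const AName: string);
begin
  inherited Create;
  Name := AName;
end;

procedure TCustomer.PrintInfo;
begin
  Writeln('Customer: ', Name);
end;

// No run-time type checks or casts: every element is known to be a TCustomer.
procedure PrintCustomers(Customers: TList<TCustomer>);
var
  Customer: TCustomer;
begin
  for Customer in Customers do
    Customer.PrintInfo;
end;

var
  Customers: TObjectList<TCustomer>;
begin
  Customers := TObjectList<TCustomer>.Create(True);   // the list owns its items
  try
    Customers.Add(TCustomer.Create('Alice'));
    Customers.Add(TCustomer.Create('Bob'));
    // Customers.Add(TObject.Create);   // would not compile: not a TCustomer
    PrintCustomers(Customers);
  finally
    Customers.Free;
  end;
end.
```

TObjectList<TCustomer> is used here only so that the sketch frees its items; it descends from TList<TCustomer>, so it can be passed to the procedure unchanged.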
2.2. Efficiency
Better efficiency is perhaps one of the main advantages of generics. Generics give the compiler more information while keeping type data available at run time. Such code is easier to write and easier to debug. In addition, in the example under consideration, the assembly code of the generic version (PrintCustomersInfo3) contains up to 10 fewer instructions than that of PrintCustomersInfo2.
2.3. Maximum code reuse
A generic class whose code was written only once can be used many times. For example, without rewriting any code, TList<T> can be used to create a list of integers (TList<Integer>), of strings (TList<String>), and so on.

In any case, these advantages are significant enough to make full use of them.

3. Built-in generic classes in Delphi
Out of the box, Delphi already ships with a number of standard generic classes that can be used when writing applications. They are located in the units Generics.Defaults and Generics.Collections. The main classes and data types are listed in Tables 1 and 2.
Table 1. Some classes of the Generics.Defaults unit

Name                    Description
IComparer<T>            Generic interface for comparing two values of the same type
IEqualityComparer<T>    Generic interface for checking two values for equality
TComparer<T>            Base generic class for classes that implement IComparer<T>
TEqualityComparer<T>    Base generic class for classes that implement IEqualityComparer<T>
TCustomComparer<T>      Base generic class for classes that implement both IComparer<T> and IEqualityComparer<T>

Table 2. Some classes and types of the Generics.Collections unit

Name                                                        Description
Classes
TArray                                                      Class with static methods for searching and sorting a generic array
TDictionary<TKey,TValue>, TObjectDictionary<TKey,TValue>    Dictionary (a collection of key-value pairs)
TList<T>, TObjectList<T>                                    Ordered list
TStack<T>, TObjectStack<T>                                  Stack implementation (last in, first out)
TQueue<T>, TObjectQueue<T>                                  Queue implementation (first in, first out)
Types
TPair<TKey,TValue>                                          Record that stores a key-value pair

Note: like their non-generic counterparts (which live in the Contnrs unit), the generic "object" classes differ from the plain ones (for example TObjectList<T> compared with TList<T>) in that they store objects as their elements and can automatically manage the lifetime of those objects.

Using the standard generic classes is fairly simple: add the corresponding units to the uses clause and use the classes you need. Listing 6 shows an example of working with a list of integers based on the generic class TList<T>.
Listing 6. Using TList<Integer> to create a list of integers
The system classes are covered in more detail in Part 2.
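
Listing 6 is not preserved here. A minimal sketch of working with a list of integers, assuming the XE2-style unit name System.Generics.Collections and a hypothetical variable Numbers, might be:

```pascal
program IntegerListSketch;
{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

var
  Numbers: TList<Integer>;
  N: Integer;
begin
  Numbers := TList<Integer>.Create;
  try
    Numbers.Add(3);
    Numbers.Add(1);
    Numbers.Add(2);
    Numbers.Sort;                  // uses the default comparer for Integer
    for N in Numbers do
      Writeln(N);
    Writeln('Count: ', Numbers.Count);
  finally
    Numbers.Free;                  // the list itself is an object and must be freed
  end;
end.
```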

4. What can be made generic in Delphi?
Naturally, Delphi lets you not only use the existing generics library but also create your own generics. Classes, interfaces and records can be generic. Generic methods (procedures and functions) are supported as well.
4.1. Generic methods
The simplest example of a generic method is a procedure that swaps the values of two variables:
Listing 7. Example of the generic procedure Swap
Such a procedure can be used as follows:
Listing 8. Using the generic procedure Swap
The result is shown in Figure 1:

Figure 1. Example of using Swap
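
Listings 7 and 8 are missing from this text. The sketch below shows one possible shape of such a procedure; because Delphi only allows generic routines as methods, it is declared inside a hypothetical helper class TGenericUtils, which is an assumption and not necessarily how the original was written:

```pascal
program GenericSwapSketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical container class for the generic method.
  TGenericUtils = class
    class procedure Swap<T>(var A, B: T); static;
  end;

class procedure TGenericUtils.Swap<T>(var A, B: T);
var
  Tmp: T;
begin
  Tmp := A;
  A := B;
  B := Tmp;
end;

var
  I1, I2: Integer;
  S1, S2: string;
begin
  I1 := 1;
  I2 := 2;
  TGenericUtils.Swap<Integer>(I1, I2);
  Writeln(I1, ' ', I2);

  S1 := 'left';
  S2 := 'right';
  TGenericUtils.Swap<string>(S1, S2);
  Writeln(S1, ' ', S2);
end.
```

The same method body serves both Integer and string; the type argument in angle brackets selects the closed version at each call site.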

4.2. Generic classes
Here is an example of a generic array class:
Listing 9. Example of the generic array class TGenericArray<T>
Let us look at one way of using it:
Listing 10. Using the generic array class TGenericArray<T>
The result is shown in Figure 2.

Figure 2. Example of using TGenericArray<T>
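
Listings 9 and 10 are not available, so the design below is hypothetical: a thin wrapper around a dynamic array with a default indexed property. Only the class name TGenericArray<T> comes from the text; all members are assumptions.

```pascal
program GenericArraySketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical minimal design of a generic array class.
  TGenericArray<T> = class
  private
    FItems: array of T;
    function GetItem(Index: Integer): T;
    procedure SetItem(Index: Integer; const Value: T);
  public
    constructor Create(ALength: Integer);
    function Count: Integer;
    property Items[Index: Integer]: T read GetItem write SetItem; default;
  end;

constructor TGenericArray<T>.Create(ALength: Integer);
begin
  inherited Create;
  SetLength(FItems, ALength);
end;

function TGenericArray<T>.Count: Integer;
begin
  Result := Length(FItems);
end;

function TGenericArray<T>.GetItem(Index: Integer): T;
begin
  Result := FItems[Index];
end;

procedure TGenericArray<T>.SetItem(Index: Integer; const Value: T);
begin
  FItems[Index] := Value;
end;

var
  Names: TGenericArray<string>;
  I: Integer;
begin
  Names := TGenericArray<string>.Create(2);
  try
    Names[0] := 'first';
    Names[1] := 'second';
    for I := 0 to Names.Count - 1 do
      Writeln(I, ': ', Names[I]);
  finally
    Names.Free;
  end;
end.
```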
4.3. Generic records
An example of the generic record TPoint<T> was already given at the beginning of this part. Take another look at it.

Converting a Delphi TPoint to a C# Point

I am trying to convert some Delphi code, as we are rewriting a Delphi 6.0 (VCL) application in .NET. I am not sure how two Delphi TPoints (x, y) compare with C# Points (x, y), and I could not figure it out.

I am trying to draw a line between two points, but since I have no idea how Delphi draws it, I cannot set the C# coordinates for it.

The Delphi code is simple:

I know that C# coordinates are at about 72 points per inch and that the pixel density needs to be calculated, but I am not sure about the Delphi PPI.

Any help would be appreciated. Thanks.

Edit: In case anyone is wondering which TPoint I mean when there is none in my code snippet: Canvas.MoveTo sets the PenPos property of the canvas, which is of type TPoint.

I am not sure exactly what question is being asked here. You have no Delphi TPoint in your code snippet; you just have client rect logical coordinates.

The origin is X = 0, Y = 0, which is the top left corner of the client area. Increasing X moves the position to the right, and increasing Y moves the position down. The logical units are pixels, so starting from the origin 0, 0, Canvas.MoveTo(10, 10) sets the new drawing position 10 pixels in from the left edge and 10 pixels down from the top, and Canvas.LineTo(20, 20) from there draws a line from the point at 10, 10 to 20, 20.

TCanvas.MoveTo and TCanvas.LineTo are simply wrappers around the underlying Windows GDI functions MoveToEx (with the third parameter always NULL) and LineTo.
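
The asker's snippet is not preserved. As a rough sketch of the calls discussed in the answer, assuming a Windows VCL environment and using an off-screen TBitmap canvas (an assumption made only so the example does not need a form):

```pascal
program CanvasLineSketch;
{$APPTYPE CONSOLE}

uses
  System.Types, Vcl.Graphics;

var
  Bmp: TBitmap;
  P: TPoint;
begin
  Bmp := TBitmap.Create;
  try
    Bmp.SetSize(100, 100);
    Bmp.Canvas.MoveTo(10, 10);   // only moves the pen: PenPos becomes (10, 10)
    Bmp.Canvas.LineTo(20, 20);   // draws from (10, 10) towards (20, 20)
    P := Bmp.Canvas.PenPos;      // PenPos is the TPoint mentioned in the question
    Writeln(P.X, ', ', P.Y);
  finally
    Bmp.Free;
  end;
end.
```

Both coordinates are in pixels measured from the top left corner of the canvas, which is the same convention the answer describes for System.Drawing.Point.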

As for the C# equivalent, if you mean System.Drawing.Point, the units used are exactly the same (although I am not sure where the origin is located by default). Given an origin of 0, 0, System.Drawing.Point(10, 10) should be the same position described above: 10 pixels in from the left edge and 10 pixels down from the top edge.

A quick check confirms that the origin in a WinForms application is in fact the top left corner of the client area, using:

I have a function:
function DisplayContextMenu(const Directory: string; Items: TStringList;
Parent: DFS_HWND; Pos: TPoint; ShowRename: boolean;
var RenameSelected: boolean): boolean; overload;
I need to call this function with the parameters
DisplayContextMenu(directory, fselectedfiles, handle, p, False, b);
where
directory: string;
p: TPoint;
b: boolean;
fselectedfiles: TStringList

whereas handle is declared in my code as type THandle.

The problem is that I cannot convert the THandle handle to the type TPoint.

I dug through the API functions and found only a function for the reverse conversion:
HWND WindowFromPoint ( POINT Point );

How do I convert the type THandle to the type TPoint?

Is TPoint a primitive in Delphi?

I am creating several TPoint objects at runtime but I am not destroying them.

I checked the code of TPoint in System.Types:

Reading it, I see there is no destructor and, moreover, that it is a record of primitives. I do not know Delphi well enough to be sure, but I think there is no need to call MyPoint.Free. Could an expert confirm?

TPoint is a record.

value types
Records and basic types like Integer are value types.
This means that local variables of these types are created on the stack.
When a function exits it cleans up the stack and by so doing reclaims the memory space.

reference types
This is in contrast to classes, which are reference types.
A class instance is created on the heap and needs to be explicitly freed.

managed types
Right in between these two extremes sit managed types like string or interface.
These are created on the heap, but the compiler uses compiler magic to automatically destroy them when their reference count drops to zero. Because of this, managed types are said to have value semantics.

ARC
On ARC compilers (mobile + Linux) even classes are automatically managed using reference counting. This means that the semantic differences between records, classes and managed types have been eliminated.

You can of course create a record on the heap if you want:
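
The answer's own snippet is missing. A minimal sketch of a heap-allocated record, using the PPoint pointer type from System.Types together with New and Dispose, could look like this:

```pascal
program HeapRecordSketch;
{$APPTYPE CONSOLE}

uses
  System.Types;

var
  P: PPoint;              // pointer to TPoint, declared in System.Types
begin
  New(P);                 // allocate a TPoint on the heap
  try
    P^.X := 3;
    P^.Y := 4;
    Writeln(P^.X, ', ', P^.Y);
  finally
    Dispose(P);           // heap allocations must be released explicitly
  end;
end.
```

Note that it is Dispose, not Free, that releases the memory here: a record has no destructor, and only the pointer-based allocation needs cleaning up.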

Remember to always pass records as const parameters (if possible). Otherwise the compiler will waste time making a copy of the record.

But... methods?
Records have methods these days.
However, this is merely syntactic sugar.
You cannot have virtual/dynamic methods and you cannot have interfaced methods.

The following two methods are exactly equivalent:

Delphi: Compare two points (TPoint)
Tip by Delphian | 20/04/2013 at 17:24

Question: We would like to compare two points of type TPoint, that is, we would like to check whether the two points have the same coordinates.

Problem: Unfortunately, in Delphi versions where TPoint does not define comparison operators, it is not possible to compare the points using P1 = P2 or P1 <> P2, because a TPoint is a record that stores several data parts (the X and Y values). Such complex data types cannot simply be compared with = or <>.

Solution: Both Delphi and Lazarus provide the function PointsEqual for this problem. Given two points as parameters, PointsEqual returns true or false depending on whether the points are equal or not.

Example: This is illustrated in the following example.

We define three points P1, P2 and P3, where P1 and P3 are identical. We then check the points for equality using PointsEqual and get the message «P1 and P3 are identical».
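
The tip's example code is not preserved. A sketch consistent with the description, assuming the Delphi unit System.Types (which declares TPoint, Point and PointsEqual) and arbitrary coordinates, might be:

```pascal
program ComparePointsSketch;
{$APPTYPE CONSOLE}

uses
  System.Types;

var
  P1, P2, P3: TPoint;
begin
  P1 := Point(10, 20);
  P2 := Point(30, 40);
  P3 := Point(10, 20);    // same coordinates as P1

  if PointsEqual(P1, P2) then
    Writeln('P1 and P2 are identical');
  if PointsEqual(P1, P3) then
    Writeln('P1 and P3 are identical');
end.
```

Only the second condition holds, since PointsEqual compares the X and Y values of both points.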

Floating point numbers — Sand or dirt

Floating point numbers are like piles of sand; every time you move them around, you lose a little sand and pick up a little dirt. — Brian Kernighan and P.J. Plauger

Real numbers are a very important part of real life and of programming too. Almost every computer language has data types for them. Most of the time, they come in the form of (binary) floating point datatypes, since those are directly supported by most processors. But these computerized representations of real numbers are often badly understood. This can lead to bad assumptions, mistakes and errors as well as reports like: “The compiler has a bug, this always shows ‘not equal’.”

Experienced floating point users will know that this can be expected, but many people using floating point numbers use them rather naïvely, and they don’t really know how they “work”, what their limitations are, and why certain errors are likely to happen or how they can be avoided. Anyone using them should know a little bit about them. This article explains them from my point of view, i.e. facts I found out the hard way. It may be slightly inaccurate, and probably incomplete, but it should help in understanding floating point, its uses and its limitations. It does not use any complicated formulas or higher scientific explanations.

Floating point types in Delphi

Floating point is the internal format in which “real” numbers, like 0.0745 or 3.141592, are stored. Unlike fixed point representations, which are simply integers scaled by a fixed amount (an example is Delphi’s Currency type), they can represent very large and very tiny values in the same format. While Delphi knows several types with differing precision, the principles behind them are (almost) the same. The types Single, Double and Extended are supported by the hardware (by the FPU, the floating point unit) of most current computers and follow the IEEE 754 binary format specs. The type Real, which is a relic of old Pascal, now maps to Double by default, but, if you set {$REALCOMPATIBILITY ON}, it maps to the Real48 type, which is not an IEEE type and used to be managed by the runtime system, that is, in software, and not by hardware. There is also a Comp type, but this is in fact not a floating point type; it is an Int64 which is supported and calculated by the FPU.

The Real48 type is pretty obsolete, and should only be used if it is absolutely necessary, e.g. to read in files that contain them. Even then it is probably best to convert them to, say, Double , store those in a new file and discard the old file.

While Real types used to be managed in software, for computers that did not have an FPU (which was not uncommon in the earlier days of Turbo Pascal), this is not the case for current systems, which have an FPU. The runtime converts Real48 to Extended , uses that to do the required calculations and then converts the result back to Real48 . This constant conversion makes the type pretty slow, so you should really, really avoid it.

Note that the above does not apply to Real , if it is mapped to Double , which is the default setting. It only applies to the 6-byte Real48 type.

Real numbers

Some developers, when encountering a problem, say: “I know, I’ll use floating-point numbers !” Now, they have 1.9999999997 problems. — unknown

The real-number system is a continuum containing real values from minus infinity (−∞) to plus infinity (+∞). But in a computer, where they are only represented in a very limited number of bytes (Extended, the largest floating point type in Delphi, has no more than 80 bits and the smallest, Single, only 32!), you can only store a limited number of discrete values, so it is not nearly a continuum. Most real numbers can only (roughly) be approximated by floating point types. Everyone using them should always be aware of this.

There are several ways in which real numbers can be represented. In written form, the usual way is to represent them as a string of digits, and the decimal point is represented by a ‘.’, e.g. 12345.678 or 0.0000345. Another way is to use scientific notation, which means that the number is scaled by powers of 10 to, usually, a number between 1 and 10, e.g. 12345.678 is represented as $1.2345678 \times 10^{4}$ or, in short form (the one Delphi uses), as 1.2345678e4.

Internal representation

The way such “real” numbers are represented internally differs a bit from the written notation. The fixed point type Currency is simply stored as a 64 bit integer, but by convention its decimal point is said to be four places from the right, i.e. the integer must be divided by 10000 to get the value it is supposed to represent. So the number 3.76 is internally stored as 37600. The type was meant to be used for currencies, but the fact that it only has 4 decimals means that calculations other than addition or subtraction can cause inaccuracies that are often not tolerable.

The floating point types used in Delphi have an internal representation that is much more like scientific notation. There is an unsigned integer (its size in bit depends on the type) that represents the digits of the number, the mantissa, and a number that represents the scale, in our case in powers of 2 instead of 10 , the exponent. There is also a separate sign bit, which is 1 if the number is negative. So in floating point, a number can be represented as:

$$\text{value} = (-1)^{\text{sign}} \times \frac{\text{mantissa}}{2^{\text{len}-1}} \times 2^{\text{exp}}$$

where sign is the value of the sign bit, mantissa is the mantissa as unsigned integer (more about this later), len is the length of the mantissa in bits, and exp is the exponent.

Mantissa

The mantissa (the IEEE calls it “significand”, but this is a neologism which means something like “which is to be signified”, and in my opinion, that doesn’t make any sense) can be viewed in two ways. Let’s disregard the exponent for the moment, and assume that its value is such that the number 1.75 is represented by the mantissa. Many texts will tell you that the implicit binary point is viewed to be directly right of the topmost bit of the mantissa, i.e. that the topmost bit represents $2^{0}$, the one below that $2^{-1}$, etc., so a mantissa of binary 1.110 0000 0000 0000 represents 1.0 + 0.5 + 0.25 = 1.75.

Other, but not so many, texts simply treat the mantissa as an unsigned integer, scaled by $2^{\text{len}-1}$, where len is the size of the mantissa in bits. In other words, a mantissa of 1110 0000 0000 0000 binary, or 57344 in decimal, is scaled by $2^{15} = 32768$ to give you 57344 / 32768 = 1.75 too. As you see, it doesn’t really matter how you approach it, the result is the same.

Exponent

The exponent is the power of 2 by which the mantissa must be multiplied to get the number that is represented. Internally, the exponent is often “biased”, i.e. it is not stored as a signed number, it is stored as unsigned, and the extremes often have special meanings for the number. This means that, to get the actual value of the exponent, you must subtract a constant value from the stored exponent. For instance, the bias for Single is 127. The value of the bias depends on the size of the exponent in bits and is chosen such that the smallest normalized value (more about that later) can be reciprocated without overflow.

There are also floating point systems that have a decimal based exponent, i.e. where the value of the exponent represents powers of 10. Examples are the Decimal type used in certain databases and the (slightly incompatible) Decimal type used in Microsoft .NET. The latter uses a 96 bit integer to represent the digits, 1 bit to represent the sign (+ or −) and 5 bits to represent a negative power of 10 (0 up to 28). The number 123.45678 is represented as $12345678 \times 10^{-5}$. I have written an almost exact native copy of the Decimal type to be used by Delphi. It is a little faster than the original .NET type, but not nearly as fast as the hardware supported types.

This article mainly discusses the floating point types used in Delphi, namely Single, Double and Extended, which are all floating binary point types. Floating decimal point types like Decimal are not supported by the hardware or by Delphi. So if, in this article, I speak of “floating point”, I mean the floating binary point types.

Sign bit

The sign bit is quite simple. If the bit is 1 , the number is negative, otherwise it is positive. It is totally independent of the mantissa, so there is no need for a two’s complement representation for negative numbers. Zero has a special representation, and you can actually even have −0 and +0 values.

Normalization and the hidden bit

I’ll try to explain normalization and denormals with normal scientific notation first.

Take the values $6.123 \times 10^{-22}$, $612.3 \times 10^{-24}$ and $61.23 \times 10^{-23}$ (or 6.123e-22, 612.3e-24 and 61.23e-23 respectively). They all denote the same value, but they have a different representation. To avoid this, let’s make a rule that there can only be one (non-zero) digit to the left of the decimal point. This is called normalization. Something similar is done with binary floating point too. Since this is binary, there is only one digit left: 1. So there can only be one (non-zero) digit (always 1) to the left of the binary point. Since this is always the same digit, it does not have to be stored, it can be implied. This is the so-called hidden bit. The types Single and Double do not store that bit, but assume it is there in calculations.

How is this done in binary? Let’s take the number 0.375. This can be calculated as $2^{-2} + 2^{-3}$ (0.25 + 0.125), or, in a mantissa, 0.011…bin (disregarding the trailing zeroes), i.e. $0.375 \times 2^{0}$. But this is not how floating point numbers are usually stored. The exponent is adjusted such that the mantissa always has its top bit set, except for some special numbers, like 0 or the so called “tiny” (denormalized) values. So the mantissa becomes 1.100…bin and the exponent is decremented by 2. This number still represents the value 0.375, but now as $1.5 \times 2^{-2}$. This is how normalization works for binary floating point. It ensures that $1.0 \le \text{mantissa} < 2.0$. Because of the hidden bit, an implicit “1.bin” must be assumed in front of the stored bits of the mantissa.

Note that in e.g. the language C, the value FLT_MIN stands for the smallest (positive) normalized value. You can have values smaller than that, but they will be denormal values, i.e. with a lower precision than 24 bits.

There is some confusion about how to denote the size (or length, as it is often called) of the mantissa of a type with a hidden bit. Some will use the actually stored length in bits, while others also count the hidden bit. For instance, a Single has 23 bits of storage reserved for the mantissa. Some will call the length of the mantissa 23, while others will count the hidden bit too and call it 24.
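
To make the sign, biased exponent and hidden bit concrete, here is a minimal sketch that takes a Single apart. It assumes a normalized value (for denormals and special values the hidden bit must not be added) and uses hypothetical variable names; the field widths and the bias of 127 are those given in the text.

```pascal
program SingleBitsSketch;
{$APPTYPE CONSOLE}

var
  S: Single;
  Bits: Cardinal absolute S;      // the same 32 bits viewed as an unsigned integer
  SignBit, StoredExp, StoredMant, FullMant: Cardinal;
begin
  S := 1.75;
  SignBit := Bits shr 31;                  // 1 bit
  StoredExp := (Bits shr 23) and $FF;      // 8 bits, biased by 127
  StoredMant := Bits and $7FFFFF;          // the 23 stored mantissa bits
  FullMant := StoredMant or $800000;       // add the hidden bit: 24 bits

  Writeln('sign     = ', SignBit);
  Writeln('exponent = ', Integer(StoredExp) - 127);   // remove the bias
  Writeln('mantissa = ', FullMant / $800000:0:4);     // scale by 2^23
end.
```

For 1.75 the stored pattern is $3FE00000, so the sketch reports sign 0, exponent 0 and mantissa 1.7500, which matches the formula above with len = 24.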

Denormalized values

With real numbers, to get close to zero, you can simply use a very negative exponent, e.g. $6.123 \times 10^{-99999}$. But in floating point, the exponent is limited and can not go below a certain value. The mantissa is limited too. Let’s assume that in our scientific notation, the exponent can not go below −100 and the mantissa can only have 4 digits. Then a very small normalized value would be $6.123 \times 10^{-100}$. To denote even smaller values, you have to resort to denormals: you drop the rule that the first digit must always be non-zero. Now you can also have values smaller than the smallest normalized (positive) value of $1.000 \times 10^{-100}$, like $0.612 \times 10^{-100}$, $0.061 \times 10^{-100}$ and $0.006 \times 10^{-100}$. This also means that the lower you go, the more precision you lose: fewer and fewer significant digits are available.

This is similar for binary floating point, except that the digits are just 0 or 1. Sometimes, after an operation, the exponent can not be decremented far enough to represent the result. In that case, the exponent is set to a special value, and the mantissa is not normalized anymore, i.e. the top bit is not 1 and the mantissa is interpreted as something like 0.000xxx…xxxbin , i.e. it has one or more leading zeroes followed by as many significant bits as will fit. Such values are called denormalized or tiny values. Because of the leading zeroes, not every bit is significant anymore, so the precision is lower than for normalized values of the same type.

Other special values

The most obvious special value is 0. Because 0 is denoted by both exponent and mantissa having all zero bits, there are actually two representations of 0, one with the sign bit cleared, and one with the sign bit set (i.e. +0 and −0). In any calculation, these are considered equal and simply represent 0.

Not every bit combination represents a number. Some represent +/− infinity, and some are invalid. The latter are called NaN — Not a Number. The rules for which bit combinations represent what are described in the Delphi help, and in the Delphi DocWiki: Single, Double and Extended. I will not repeat that information here. But the Math unit contains a few constants and functions that can help you check or assign some of these values:
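
The list of constants and functions is not preserved in this text. As a sketch of the kind of helpers meant, the following uses the constants NaN, Infinity and NegInfinity and the functions IsNan and IsInfinite, all declared in the Math unit (written here as System.Math, which assumes Delphi XE2 or later):

```pascal
program SpecialValuesSketch;
{$APPTYPE CONSOLE}

uses
  System.Math;

var
  D: Double;
begin
  D := NaN;
  Writeln('NaN:          IsNan = ', IsNan(D), ', IsInfinite = ', IsInfinite(D));
  D := Infinity;
  Writeln('Infinity:     IsNan = ', IsNan(D), ', IsInfinite = ', IsInfinite(D));
  D := NegInfinity;
  Writeln('NegInfinity:  IsNan = ', IsNan(D), ', IsInfinite = ', IsInfinite(D));
  D := 1.0;
  Writeln('1.0:          IsNan = ', IsNan(D), ', IsInfinite = ', IsInfinite(D));
end.
```

Testing with IsNan is the reliable way to detect a NaN, since comparing a NaN with = does not behave like an ordinary equality test.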

IEEE types

The IEEE types used in Delphi are

Type      Mantissa bits  Exponent bits  Sign bit  Smallest value             Biggest value              Exponent bias  Notes
Single    0-22           23-30          31        $1.5 \times 10^{-45}$      $3.4 \times 10^{+38}$      127
Double    0-51           52-62          63        $5.0 \times 10^{-324}$     $1.7 \times 10^{+308}$     1023
Extended  0-63           64-78          79        $3.4 \times 10^{-4951}$    $1.1 \times 10^{+4932}$    16383          no hidden bit

The following diagram shows a simple representation of these types:

Due to how floating point is implemented on Win64 (using SSE instead of the x87 FPU), there is no 80-bit floating point type in the 64-bit compiler. That is why, on Win64, Delphi’s Extended is aliased to the 64-bit type Double.

Using floating point numbers

Nothing brings fear to my heart more than a floating point number. — Gerald Jay Sussman

In the following, I am using the terms small and large. I mean values that have a very low or a very high exponent, respectively, regardless of their sign. That means that small values are very close to 0 , while large values are far away from 0 . In other words, I am addressing their magnitude, not their signs.

As you can see in the diagram, the different types have quite a different precision. Internally, for calculations, Delphi always uses Extended (on the x87 FPU). Literals, like 0.1, are also stored as Extended. That is why the little code snippet at the beginning of this article produced False: the value was converted from Extended to Single, losing a few bits of precision, and for the comparison, it was converted back to Extended. The loss of precision caused the difference, so the result of the comparison was False.
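
The snippet referred to here is not preserved in this text. A minimal sketch of the kind of comparison described, assuming a Single variable named S, would be:

```pascal
program SingleCompareSketch;
{$APPTYPE CONSOLE}

var
  S: Single;
begin
  S := 0.1;            // the literal is rounded to 24 significant bits here
  Writeln(S = 0.1);    // S is widened again, but the lost bits do not come back
end.
```

With the 32-bit compiler described above this is expected to print FALSE, because the widened Single no longer matches the more precise literal.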

There are many such traps, caused by the limitations of how the infinite range of real numbers must be represented in a finite number of bits. Some of these traps are discussed in the following paragraphs.

Rounding

After calculations, e.g. multiplications or additions, the result can contain more significant bits than the type can hold, so the FPU must round the values to make them fit and normalized again, which means that a number of bits gets lost. How this rounding is done is governed by IEEE rules. But this means that there will be additional tiny inaccuracies. An example:

As you can see, the closest possible representation for 0.1 in a Single is 0.10000000149011612. If this is squared and then rounded, you get 0.01000000070780516, but the closest representation for 0.01 is 0.00999999977648258. So, in other words, Single(0.1) * Single(0.1) <> Single(0.01).
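This inequality can be checked with a sketch like the following. SquareAsSingle is a hypothetical helper; the important part is that the product is stored in a Single, so it is rounded to 24 significant bits.

```delphi
program SquareRounding;

{$APPTYPE CONSOLE}

// Hypothetical helper: squares a Single and stores the product in a Single,
// which forces the result to be rounded to Single precision.
function SquareAsSingle(const Value: Single): Single;
begin
  Result := Value * Value;
end;

var
  Tenth, Hundredth, Squared: Single;
begin
  Tenth := 0.1;
  Hundredth := 0.01;
  Squared := SquareAsSingle(Tenth);
  Writeln('Squared = Hundredth: ', Squared = Hundredth);
  Writeln('Squared:   ', Squared:0:17);
  Writeln('Hundredth: ', Hundredth:0:17);
end.
```

Values that are exact in binary do not show the effect: squaring 0.5 gives exactly 0.25.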

Doing multiple calculations like this will slowly add up the errors, and they do not necessarily cancel each other out. It is very important that you take such errors into consideration and do no more calculations than necessary. It is always a good idea to simplify your expressions and to use professional libraries that know how to avoid unnecessary calculations. As in so many programming problems, the choice of algorithm and of the types used is also very important.

Rounding modes and tie-breaking rules

Rounding is generally done to the nearest more significant digit available. But sometimes there is a tie, if the value to be rounded is exactly halfway between the two nearest candidates. In that case, a tie-breaking rule is required. One very common rule is called banker’s rounding (although banks are not known to use or to have used it), which says that a tie is rounded to the nearest even more significant digit. This means that 24.05 is rounded to 24.0, but 24.15 to 24.2.

Other commonly used tie-breaking rules are:

  • Truncating (towards 0): this means that 24.05 is rounded to 24.0, and −24.05 to −24.0. In fact, the less significant digits are simply dropped.
  • Rounding up (towards +∞): this means that 24.05 is rounded to 24.1, but −24.05 to −24.0. This mode is taught in many schools.
  • Rounding down (towards −∞): this means that 24.05 is rounded to 24.0, and −24.05 to −24.1.
  • Rounding away from 0: this means that 24.05 is rounded to 24.1, and −24.05 to −24.1. This mode is taught in many schools too, but is not an IEEE approved method.

Note that there are other rounding modes that do not round to the nearest more significant digit, but round to the more significant digit that is either above (closer to +∞), below (closer to −∞) or closer to 0.

RoundTo and SimpleRoundTo

The Math unit contains two nice functions, RoundTo and SimpleRoundTo, that round a floating point value (Extended) to a set number of digits.

RoundTo is probably a little more accurate and faster, but SimpleRoundTo allows a bigger range of digits and uses a slightly different rounding algorithm.
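A minimal usage sketch of both functions, assuming Delphi XE2 or later for the scoped unit names (older versions use SysUtils and Math). The sample value is arbitrary.

```delphi
program RoundToDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

var
  Value: Extended;
begin
  Value := 1234.5678;
  // A negative digit count rounds to that many decimals.
  Writeln(FloatToStr(RoundTo(Value, -2)));
  Writeln(FloatToStr(SimpleRoundTo(Value, -2)));
  // A positive digit count rounds to tens, hundreds, and so on.
  Writeln(FloatToStr(RoundTo(Value, 2)));
end.
```

The two functions only differ for ties: the RTL documentation describes RoundTo as using banker's rounding and SimpleRoundTo as using asymmetric arithmetic rounding. Keep in mind that a decimal tie such as 1.245 is usually not an exact tie once it is stored in binary.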

For better decimal rounding than these rather simple approaches, take a look at John Herbster’s DecimalRounding_JH1 unit on Embarcadero’s CodeCentral. It uses a more sophisticated algorithm which produces better results. It implements all the rounding modes I discussed in the Rounding modes and tie-breaking rules section above.

The x87 Floating Point Unit

The x87 FPU knows 4 rounding modes (see the FPU control word section of this article). So how does the FPU round? Say an operation on a Single produced an intermediate result that has some extra low bits. The extended mantissa looks like this:

1.0001 1100 0100 1100 1001 0111

The last bit is the extra bit that has to be rounded away. There are two possible values this can be rounded to, the value directly below and the value directly above:

1.0001 1100 0100 1100 1001 011
1.0001 1100 0100 1100 1001 100

Now what happens depends on the rounding mode. If the rounding mode is the default, round to nearest “even”, it will get rounded to the value that has a 0 as least significant bit. You can probably guess which of the two values is chosen for the other rounding modes.
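The four modes can be selected with SetRoundMode from the Math unit. Rounding of a single mantissa bit is hard to observe directly, so the following sketch watches the modes through Round instead. It assumes that Round follows the current rounding mode, as the RTL documentation states, and that the scoped unit name System.Math is available (XE2 or later).

```delphi
program RoundModeDemo;

{$APPTYPE CONSOLE}

uses
  System.Math;

var
  Value: Double;
begin
  // A variable is used so that Round is evaluated at run time.
  Value := 2.25;

  SetRoundMode(rmNearest);
  Writeln('rmNearest:  ', Round(Value));
  SetRoundMode(rmUp);
  Writeln('rmUp:       ', Round(Value));
  SetRoundMode(rmDown);
  Writeln('rmDown:     ', Round(Value));
  SetRoundMode(rmTruncate);
  Writeln('rmTruncate: ', Round(Value));

  // Restore the default mode so later calculations are not affected.
  SetRoundMode(rmNearest);
end.
```

For 2.25 one would expect 2, 3, 2 and 2, in that order. Always restore the previous mode, since other code relies on the default.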

Measuring rounding errors

There are ways to measure accumulated rounding errors. The most common methods used are ULP and relative or approximation error. Discussing them is outside the realm of this article, so I have to refer you to Wikipedia and the articles mentioned in the References section of this article.

Conversion

It is never a good idea to write code that requires a lot of conversions, for instance code that must convert between several floating point types, since each conversion, especially to a less precise type, can mean the loss of a few bits and therefore increases the inaccuracy. If space or speed are not as important as accuracy, use the Extended type throughout, because Delphi also uses it internally in most system functions. An example:
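A minimal sketch of such a loss, using a hypothetical helper ThroughSingle that narrows a Double to Single and widens it again:

```delphi
program ConversionLoss;

{$APPTYPE CONSOLE}

// Hypothetical helper: narrows a Double to Single and widens it again.
function ThroughSingle(const Value: Double): Double;
var
  Narrow: Single;
begin
  Narrow := Value;    // rounds to 24 significant bits
  Result := Narrow;   // widening is exact, but the lost bits stay lost
end;

var
  Original, RoundTrip: Double;
begin
  Original := 0.1;
  RoundTrip := ThroughSingle(Original);
  Writeln('0.1 unchanged after round trip: ', RoundTrip = Original);
  Writeln('0.5 unchanged after round trip: ', ThroughSingle(0.5) = 0.5);
end.
```

The first comparison fails because 0.1 needs more bits than a Single offers; the second succeeds because 0.5 is a power of 2 and fits in any of the types.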

Literals

In source code, we use decimal numbers. But floating point types are stored as binary. For integers, this is not a big problem, but as soon as fractions are involved, there is one. Not every number that can be represented exactly in decimal can be represented exactly in binary, just like certain numbers, e.g. 1/3 or π, cannot be represented exactly in decimal format. Only numbers that are sums of powers of 2 can be represented exactly in a binary floating point type (e.g. 3.625 = 2 + 1 + 0.5 + 0.125). A number like 0.1 cannot be composed of such powers. The compiler will try to get the best approximation that is possible, but there will always be a small difference.

Comparing values

The above shows that it is never a good idea to compare floating point values directly. Conversions and rounding cause tiny inaccuracies. These errors can add up, the more calculations you do.

To accommodate these inaccuracies, it is a good idea to always use a small error value in comparisons. Delphi’s Math unit contains a number of overloaded functions that can help you do that, such as SameValue, IsZero and CompareValue.

An ε (epsilon) value is a small value you can use as an error range. These functions either take an ε you provide or, if ε is 0, they will calculate an ε that takes the magnitude of the operands you are comparing into consideration. So it is usually best only to pass the operands, and not a specific ε, unless you have a really good reason to force one upon the function. An example follows:


That is because S1 has the value 0.300000011920928955078125, while the calculation resulted in S2,

  • which started out as 0.100000001490116119384765625 ,
  • then, after the division, became 0.00999999977648258209228515625
  • and after the multiplication 0.0999999940395355224609375 .
  • Adding this value twice more resulted in 0.2999999821186065673828125 .

(Exact values extracted using my ExactFloatString unit)

SameValue accounts for the little differences, while = compares for exact equality.
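A sketch that is consistent with the values listed above could look like this. The exact statements are an assumption; what matters is that every intermediate result is stored in a Single and is therefore rounded.

```delphi
program SameValueDemo;

{$APPTYPE CONSOLE}

uses
  System.Math;   // XE2 or later; use Math in older versions

var
  S1, S2: Single;
begin
  S1 := 0.3;
  S2 := 0.1;
  S2 := S2 / 10.0;      // each assignment rounds back to Single
  S2 := S2 * 10.0;
  S2 := S2 + S2 + S2;   // add the value twice more
  Writeln('S1 = S2:           ', S1 = S2);
  Writeln('SameValue(S1, S2): ', SameValue(S1, S2));
end.
```

With the values given in the text, the direct comparison should fail while SameValue succeeds, since the two Singles differ only in their lowest bit. Passing two Single variables selects the Single overload, so the calculated ε matches the precision of the operands.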

Catastrophic cancellation

If almost equal values are subtracted (or two values with differing sign but otherwise almost equal values are added), the result is a value that is tiny, compared to the values. This tiny value can well be in the range of the roundoff errors mentioned before, so it can’t be trusted. It is another situation you should avoid. An example follows:

The output from Delphi 2010 is:

One would expect the difference to be 1.0 × 10^−18, but the value you get is 1.735 × 10^−18.

Also note that the output doesn’t display the decimal 1 in E1 , which shows you can’t always trust the accuracy of your output either.

This is an example of catastrophic cancellation: a devastating loss of precision when small numbers are computed from large numbers, which themselves are subject to roundoff error.
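The same effect can be sketched with Double values, which also exist on Win64. The values here are my own choice and differ from the Extended example discussed above; Subtract is a hypothetical helper.

```delphi
program CancellationDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: subtracts two Doubles that were rounded on the way in.
function Subtract(const A, B: Double): Double;
begin
  Result := A - B;
end;

var
  Difference: Double;
begin
  // Intended: (1 + 1.0E-15) - 1 = 1.0E-15
  Difference := Subtract(1.000000000000001, 1.0);
  Writeln(Difference);
  Writeln('Equal to 1.0E-15: ', Difference = 1.0E-15);
end.
```

The subtraction itself is exact. The error comes from the first operand, which can only be stored as the nearest multiple of 2^−52, so the difference comes out as about 1.11 × 10^−15 instead of 1.0 × 10^−15, an error of roughly 11 percent.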

Greatly differing magnitude

This is more or less the reverse of catastrophic cancellation.

If two values differ greatly in magnitude, the smaller of the two might be below the precision of the larger one. So adding the tiny value to such a huge value (or subtracting it) will have no effect. That means that you should take care in which order you do such additions or subtractions. Take the following simple example:

The results shown are:

The first result is what you would expect, but the second one is the result of the fact that S3 got swallowed by the precision of the large value in S2, so here, S2 + S3 = S2. In mathematics, addition is associative: (a + b) + c = a + (b + c).

But the addition of floating point values is not associative, so in general (a + b) + c ≠ a + (b + c).
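A minimal sketch of this effect with hypothetical values (they are not the ones from the original example). AddLeftToRight is a hypothetical helper that stores the partial sum in a Single, so it is rounded before the third value is added.

```delphi
program AssociativityDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: computes (A + B) + C and rounds the partial sum
// to Single by storing it in a variable.
function AddLeftToRight(const A, B, C: Single): Single;
var
  Partial: Single;
begin
  Partial := A + B;
  Result := Partial + C;
end;

var
  S1, S2, S3: Single;
begin
  S1 := -100000000.0;
  S2 := 100000000.0;
  S3 := 1.0;
  // (S1 + S2) + S3: the large values cancel first, then 1 is added.
  Writeln(AddLeftToRight(S1, S2, S3):0:1);
  // (S2 + S3) + S1, which equals S1 + (S2 + S3): the 1 is swallowed by S2.
  Writeln(AddLeftToRight(S2, S3, S1):0:1);
end.
```

Near 100000000, neighbouring Single values are 8 apart, so 100000001 cannot be stored and S2 + S3 is rounded back to S2. The first line prints 1.0, the second 0.0.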

Note that if you have many values to add, it makes sense to sort them in order of magnitude. A nice explanation is given by Steve Jessop in an answer to this Stack Overflow question. Be sure to read the comments too.

It comes down to the fact that if you add a tiny number to a big one, the tiny one may not change the big one, but if you add a lot of tiny ones first, they may accumulate to a value that can make a difference by being closer to the big one. The link gives some examples.

Also note the answer recommending the Kahan summation algorithm, by Daniel Pryden. Kahan’s algorithm sums the rounding errors in an extra floating point variable and uses that to get a more correct answer.
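A minimal sketch of Kahan summation next to a naive loop. It is an illustration, not a tuned library routine; the function names and the test data are my own.

```delphi
program KahanDemo;

{$APPTYPE CONSOLE}

// Naive left-to-right summation.
function NaiveSum(const Values: array of Double): Double;
var
  I: Integer;
begin
  Result := 0.0;
  for I := Low(Values) to High(Values) do
    Result := Result + Values[I];
end;

// Kahan summation: Compensation carries the rounding error of the
// previous addition into the next one.
function KahanSum(const Values: array of Double): Double;
var
  I: Integer;
  Sum, Compensation, Corrected, NewSum: Double;
begin
  Sum := 0.0;
  Compensation := 0.0;
  for I := Low(Values) to High(Values) do
  begin
    Corrected := Values[I] - Compensation;        // apply the stored error
    NewSum := Sum + Corrected;                    // low bits may be lost here
    Compensation := (NewSum - Sum) - Corrected;   // recover what was lost
    Sum := NewSum;
  end;
  Result := Sum;
end;

const
  // One large value followed by ten values too small to change it alone.
  Data: array[0..10] of Double = (1.0, 1.0E-16, 1.0E-16, 1.0E-16, 1.0E-16,
    1.0E-16, 1.0E-16, 1.0E-16, 1.0E-16, 1.0E-16, 1.0E-16);
begin
  Writeln('Naive: ', NaiveSum(Data));
  Writeln('Kahan: ', KahanSum(Data));
end.
```

In the naive loop, each 1.0E-16 is below half the spacing of Double values near 1.0, so the sum stays at 1.0. The compensated version ends up close to the true sum of 1 + 1.0E-15.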

Functions requiring real values

Not only are fractions like 0.1 not representable in binary floating point, there are also values that are not representable in any integral number base, like the irrational numbers π, Euler’s number e and √2. Functions based on numbers like these are bound to be inaccurate, especially in a limited format like floating point, and because they require multiple internal calculations, even if these are probably done with greater precision. That is why function calls like sin(π) do not deliver exact results. For Sin(Pi), Delphi returns −5.42101086242752 × 10^−20 instead of the expected 0.

Avoiding the traps

There are a few tips to avoid the many traps.

  • Never forget that Delphi’s floating point types store values in binary, and that binary often can’t represent decimal values exactly.
  • Choose the right precision for your application.
  • Do not mix several types of floating point.
  • Be aware of rounding errors and that they can add up.
  • Optimize and simplify your algorithms to avoid too many calculations.
  • Use professional libraries instead of cooking your own ones.
  • Take care when adding or subtracting values of greatly differing magnitude, and be aware of the risk of catastrophic cancellation when subtracting nearly equal values.
  • Do not compare values directly, but use library functions like SameValue.

Internals

So how does this look internally? In the following example I use a Single, because Singles have a readily comprehensible number of bits in the mantissa and exponent. Let me show you how a number like 0.1 is stored in a Single.

After ordering the bits, this is:

  • the sign bit is 0,
  • the exponent is 123 − 127 = −4 and
  • the mantissa is (incl. hidden bit) 1100 1100 1100 1100 1100 1101 or $CCCCCD or 13421773.

If 13421773 is multiplied by 2^−4 (0.0625), the result is 838860.8125. After dividing that by 2^23 (8388608), this becomes 0.100000001490116119384765625, which is indeed pretty close to 0.1. The following table shows that this is indeed the closest value, by also calculating the values with one ULP difference, i.e. with mantissas $CCCCCC and $CCCCCE respectively.

Hex        Value                          Difference with 0.1 (abs)
$3DCCCCCC  0.0999999940395355224609375    0.0000000059604644775390625
$3DCCCCCD  0.100000001490116119384765625  0.000000001490116119384765625
$3DCCCCCE  0.10000000894069671630859375   0.00000000894069671630859375
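The decomposition above can be reproduced with a few shifts and masks. This is a sketch; SingleToBits is a hypothetical helper that copies the raw bytes of a Single into a Cardinal.

```delphi
program SingleBits;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;   // XE2 or later; use SysUtils in older versions

// Hypothetical helper: copies the four bytes of a Single into a Cardinal
// without converting the value.
function SingleToBits(const Value: Single): Cardinal;
begin
  Move(Value, Result, SizeOf(Result));
end;

var
  Bits, Mantissa: Cardinal;
  Exponent: Integer;
begin
  Bits := SingleToBits(0.1);
  Exponent := Integer((Bits shr 23) and $FF) - 127;   // remove the bias
  Mantissa := (Bits and $7FFFFF) or $800000;          // add the hidden bit
  Writeln('Bits:     $', IntToHex(Bits, 8));
  Writeln('Sign:     ', Bits shr 31);
  Writeln('Exponent: ', Exponent);
  Writeln('Mantissa: $', IntToHex(Mantissa, 6), ' = ', Mantissa);
end.
```

For 0.1 this should show $3DCCCCCD, sign 0, exponent −4 and mantissa $CCCCCD, matching the middle row of the table.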

To see how the conversion from text to binary is done (well, more or less), take a look at this StackOverflow answer of mine.

In reality, the functions that convert from a string in decimal format to binary floating point are very complicated. The de facto standard C implementation, strtod , by David M. Gay, uses several different algorithms, depending on the value. One of these algorithms even requires a simple implementation of an unlimited precision BigInteger. So it is, in many cases, not nearly as simple as in my Stack Overflow answer mentioned above.

My BigDecimal implementation can do extremely accurate conversions between decimal format strings like ‘1.34567e-138’ and binary floating point types too (both ways), using BigDecimal as an intermediate representation.

The FPU control word

The FPU control word is a word-size set of bits that control the behaviour of the FPU. The bits are set up as follows:

Bits               Name  Values  Description
Exception flag masks
0                  IM    1       Invalid operation
1                  DM    1       Denormalized operand
2                  ZM    1       Zero divide
3                  OM    1       Overflow
4                  UM    1       Underflow
5                  PM    1       Precision
Precision bits
8, 9               PC    00      Single precision (24 bit mantissa)
                         10      Double precision (53 bit mantissa)
                         11      Extended precision (64 bit mantissa)
                         01      reserved
Rounding mode
10, 11             RM    00      Round to nearest even (banker’s rounding)
                         01      Round down (toward −∞)
                         10      Round up (toward +∞)
                         11      Round toward zero (trunc)
Infinity control
12                 X             Used for compatibility with 287 FPU
                         0       Projective
                         1       Affine
6, 7, 13, 14, 15                 reserved and not used
$133F masks (turns off) all exceptions.

In Delphi, to control the FPU control word (in Delphi, it is called 8087CW), there are a few functions, mentioned in the help and the DocWiki entry for the FPU Control Word. An example of their use:

There is no exception, since including exZeroDivide masks the FPU’s division by zero exception. This means that the division by zero results in +∞ instead.
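A sketch of the kind of code described here, assuming Delphi XE2 or later, where the mask type is called TArithmeticExceptionMask (older versions call it TFPUExceptionMask):

```delphi
program ExceptionMaskDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

var
  OldMask: TArithmeticExceptionMask;
  Zero, Quotient: Double;
begin
  // A variable is used so that the division happens at run time.
  Zero := 0.0;
  // Add exZeroDivide to the exceptions that are already masked.
  OldMask := SetExceptionMask(GetExceptionMask + [exZeroDivide]);
  try
    Quotient := 1.0 / Zero;
    Writeln('Result is infinite: ', IsInfinite(Quotient));
  finally
    // Restore the previous mask so that other code sees its usual exceptions.
    SetExceptionMask(OldMask);
  end;
end.
```

Restoring the old mask in the finally block keeps the change local to the calculation that needs it.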

Investigating floating point types

If you want to investigate or (ab)use the internal formats of the floating point types a little more, you should look for the routines by John Herbster, former member of TeamB. Most of them can be found on Embarcadero’s CodeCentral.

But also take a look at my (new) ExactFloatStrings unit, which is included with my BigIntegers unit. It works in Delphi XE3 up to Delphi 10 Seattle. I am working on making it work in Delphi XE2 too.

Also note that from Delphi XE3 on, you can use the helpers for floating point types, by including the System.SysUtils unit in your uses clause. For instance, to access the mantissa and exponent of a Double, you can do something like:
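A minimal sketch, assuming the TDoubleHelper members Sign, Exponent and Mantissa as listed in the System.SysUtils documentation; check the documentation of your Delphi version for their exact meaning and availability.

```delphi
program HelperDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  D: Double;
begin
  D := 0.1;
  // These members come from the record helper for Double in System.SysUtils.
  Writeln('Sign:     ', D.Sign);
  Writeln('Exponent: ', D.Exponent);
  Writeln('Mantissa: $', IntToHex(D.Mantissa, 14));
end.
```

The helpers save you from doing the shifts and masks by hand, which is less error-prone than casting the value to an integer type.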

Basic conversions of floating point values…

… to their composite parts

There are a few basic functions that can be useful to examine the composing parts of a floating point value:

Function        Unit (old name)             Output
Int             System                      Returns the integral part (i.e. the part before the decimal point) of a floating point value as Extended.
Frac            System                      Returns the fractional part (i.e. the part after the decimal point) of a floating point value as Extended.
Sign            System.Math (Math)          Returns the sign of a number value as TValueSign.
Frexp           System.Math (Math)          Procedure that returns the mantissa and the exponent of a Single, Double or Extended value as the same type and an Integer, respectively.
FloatToDecimal  System.SysUtils (SysUtils)  Procedure that returns the composing parts of a floating point value in a TFloatRec as data that can be used for formatting.

… to integers

To convert floating point values to Integers, there are a few system functions which each convert their numbers a little differently.

Function  Unit (old name)     Output
Trunc     System              Rounds a floating point value to the Int64 value nearest to zero (i.e. it truncates toward 0).
Round     System              Rounds a floating point value to the nearest Int64 value, or when it is exactly halfway, uses “Banker’s rounding”.
Floor     System.Math (Math)  Rounds a floating point value to the highest Int64 value that is less than or equal to it (i.e., it truncates toward −∞).
Ceil      System.Math (Math)  Rounds a floating point value to the lowest Int64 value that is greater than or equal to it (i.e., it truncates toward +∞).

These functions generally issue an EInvalidOp exception if the result would be outside the Int64 range.
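The differences are easiest to see with values that lie exactly halfway between two integers. A small sketch, assuming the default rounding mode; Show is a hypothetical helper.

```delphi
program ToIntegerDemo;

{$APPTYPE CONSOLE}

uses
  System.Math;   // XE2 or later; use Math in older versions

// Hypothetical helper: prints the four conversions of one value.
procedure Show(const Value: Double);
begin
  Writeln(Value:5:1, '  Trunc=', Trunc(Value), '  Round=', Round(Value),
    '  Floor=', Floor(Value), '  Ceil=', Ceil(Value));
end;

begin
  Show(2.5);    // Trunc=2   Round=2   Floor=2   Ceil=3
  Show(3.5);    // Trunc=3   Round=4   Floor=3   Ceil=4
  Show(-2.5);   // Trunc=-2  Round=-2  Floor=-3  Ceil=-2
end.
```

Note how Round sends both 2.5 and −2.5 to the even neighbour, while 3.5 goes up to 4.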

… to and from text

To display a floating point number, the runtime must convert it from binary back to decimal. Here too, inaccuracies can creep in. The format you choose matters as well, and the specific output may depend on the format settings for the current locale, too.

The runtime library, especially the System.SysUtils (SysUtils) unit, provides you with some convenient functions to format such numbers, like Format, FormatFloat, FloatToStrF, FloatToText and FloatToTextFmt. Take a look at FloatToDecimal as well.

The other way around, conversion from text to floating point, has some limitations. In Win32, a routine like StrToFloat internally uses Extended, so any Double or Single values resulting from this will be accurate. Unfortunately, this is not true in Win64. In Win64, the result of StrToFloat for large values (e.g. -1.79e30) can be off by one (lowest) bit, because internally, it uses Double for the conversion, and somehow the rounding seems to be slightly inaccurate. For most practical purposes, this is not really a problem, but in some cases it can be.

Note that in C++Builder, a routine like strtod() is even slightly more inaccurate. I have found differences of two lowest bits.

Conclusion

Floating point types are useful, but one must be aware of their limitations. I hope this article helped you understand them a little better. But there are certainly things I forgot to mention, or which are incorrect. I am grateful for any constructive remark, criticism, objection, etc. You can contact me by e-mail to tell me what you think of this.

References and further reading

  • Floating Point — Robert Sedgewick and Kevin Wayne
  • Floating Point Arithmetic: Issues and Limitations — Python Software Foundation
  • The Perils of Floating Point — Bruce M. Bush
  • What Every Computer Scientist Should Know About Floating-Point Arithmetic — David Goldberg
    The publication that is regarded by many as the standard reference on floating point.
  • The trouble with rounding floating point numbers — Dan Clarke
  • Floating point numbers – what else can be done? — Dan Clarke
  • What is Floating Point? — WiseGeek
  • Gleitkommazahl — (German wikipedia)
  • Exploring Binary — Rick Regan
    Great site with lots of information about floating point (conversions) and other things.
  • Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 — Intel®
  • FPU Control Word Bits — Tony Costanza, Earl F. Glynn
  • Stack Overflow — A very good resource for questions around the topic.

These links are being provided as a convenience and for informational purposes only; they do not constitute an endorsement or an approval of any of the products, services or opinions of the corporation or organization or individual. I bear no responsibility for the accuracy, legality or content of the external site or for that of subsequent links. Contact the external site for answers to questions regarding its content.

The coding examples presented here are for illustration purposes only. The author takes no responsibility for end-user use. All content herein is copyrighted by Rudy Velthuis, and may not be reproduced in any form without the author’s permission. Source code written by Rudy Velthuis presented as download is subject to the license in the files.
