doc/classfile-spec.md
This document defines the syntax and semantics of XGo classfiles.
A classfile is a source file that is compiled as a generated class type plus a set of generated methods and helper glue.
There are two kinds of classfiles:
.gox fileThe following terms are used throughout this document:
_app.gox and .gshvar declaration that is interpreted as class fields rather than
package variablesClass extension extraction operates on the base filename.
For a filename whose last path extension is not .gox, the class extension is the last path extension.
For a filename whose last path extension is .gox, the class extension is:
.gox and is not the first character of
the filename.goxExamples:
Rect.gox has class extension .goxmain_app.gox has class extension _app.goxmain.gsh has class extension .gshA source file is treated as a classfile if and only if one of the following is true:
.gox and its normalized class extension is not recognized by the registryIf the second rule applies, the file is a normal classfile.
If the first rule applies, the file is a framework classfile.
Each recognized class extension belongs to exactly one framework registration.
A recognized file is a project classfile if the registration marks it as a project file for the pair
(class extension, filename). Otherwise it is a work classfile.
A recognized file is classified as follows:
main plus that extension is a project
file and all other files with the same extension are work filesExamples:
_case.gox, main_case.gox is a project file
and foo_case.gox is a work file_app.gox is the project extension and _cmd.gox is the work extension, demo_app.gox is a
project file and list_cmd.gox is a work fileA classfile may omit the package clause.
If a classfile omits the package clause, it is compiled as if it declared package main.
Import declarations follow ordinary XGo source-file rules:
Ignoring comments, a classfile has the following top-level structure:
Classfile = [ PackageClause ] { ImportDecl } { ClassDecl } [ TopLevelStmtList ] .
ClassDecl = ConstDecl | TypeDecl | VarDecl | FuncDecl .
TopLevelStmtList = StatementList .
The optional TopLevelStmtList must be the final top-level construct in the file. After the first top-level statement
is encountered, the remainder of the file is parsed using ordinary function-body statement-list rules as the body of the
shadow entry.
Accordingly, constructs that are valid as statements inside a function body, including local declaration statements such
as var x = 1, become part of the shadow entry rather than remaining top-level declarations.
For the purposes of class lowering, at most one top-level var declaration is interpreted as the field declaration
block.
The field declaration block is identified semantically as follows:
var declaration in the file's declaration listimport, const, or type declarationvar declaration satisfying those conditionsThat declaration does not introduce package variables. Instead, it declares fields of the generated class type.
All other top-level var declarations follow ordinary variable-declaration rules. If they occur before the shadow entry
begins, they are package-level variables. If they occur after the shadow entry begins, they are local declaration
statements inside that entry method.
This specification defines embedded-field syntax only for the field declaration block.
Within the field declaration block, each spec has one of the following forms:
FieldSpec = EmbeddedField [ Tag ]
| IdentifierList [ Type ] [ "=" ExpressionList ] [ Tag ] .
EmbeddedField = [ "*" ] TypeName
| [ "*" ] PackageName "." identifier .
Tag = string_lit .
The following special rule applies only inside the field declaration block:
IdentifierList may appear without either a Type or an = initializerFor any other FieldSpec, ordinary var-declaration validity rules apply. In particular, at least a Type or an =
initializer must be present.
Examples:
var (
Width, Height int
*bytes.Buffer "buffer"
BaseClass
)
Within the field declaration block:
_:"..."For every classfile, the compiler generates one named class type.
The initial class type name is derived from the class file stem.
The class file stem is normalized as follows:
: is removed# is removed- is replaced with _. is replaced with _Examples:
Rect.gox produces the initial type name Rectget_p_#id_app.gox has class file stem get_p_#id and normalized type name get_p_idFramework metadata may further transform the type name:
main, or a framework that has no explicit project file, uses the project
base-class name as its default class type name. A leading * on the base-class name is removed-prefix= is prepended to the normalized work-file steminit, main, go, goto, type, var, import, package, interface, struct, const, func, map,
chan, for, if, else, switch, case, select, defer, range, return, break, continue,
fallthrough, or default, an underscore is prependedThe generated class type must be unique within the package after all such normalization.
The underlying type of every class type is a struct type.
For a normal classfile, the struct fields are the user fields declared in the field declaration block, in source order.
For a framework classfile, framework-added fields precede user fields.
Framework-added fields are inserted in the following order:
-embed flag, in work-class declaration
order and lexicographic source-file path orderUser fields from the field declaration block are appended after all framework-added fields.
It is an error for two generated fields of the same class type to have the same field name.
In a classfile, every top-level function declaration without an explicit receiver is rewritten as a method on the generated class type.
The injected receiver is:
this*T, where T is the generated class typeThis rewriting applies to ordinary function declarations, including a function named init.
As a consequence, func init() inside a classfile declares a method named init. It is not a package initialization
function.
The same rewriting rule also applies to func main(). Inside a classfile, func main() declares a method named main.
It is not a package-level entry function.
A top-level function declaration that already has an explicit receiver is not rewritten by the classfile mechanism.
The classfile parser also accepts the following classfile-only FuncDecl form:
StaticMethodDecl = "func" "." identifier Signature [ FunctionBody ] .
This form declares a static method associated with the generated class type. Its further lowering follows the ordinary XGo static-method machinery.
Top-level statements are not compiled as package-level statements.
Instead, they are wrapped into a synthetic function with an initially empty parameter list, called the shadow entry. After classfile lowering, the shadow entry is renamed as follows:
MainEntry for a project classfileMain for any other classfileIf a framework classfile has no explicit top-level statement sequence, the compiler still synthesizes an empty shadow entry with the same name. This ensures that framework entry methods always exist.
For a normal classfile, no empty shadow entry is synthesized. A normal classfile without top-level statements therefore
has no synthetic Main method.
If a synthetic Main or MainEntry method is created for a framework classfile, and the embedded base class declares a
method with the same name, the synthetic method adopts that base method's parameter list and result list, with the
classfile receiver *T replacing the base receiver.
The generated body forwards the incoming arguments to the embedded base method before executing any user-written top-level statements.
If a synthetic Main or MainEntry method is generated, its body executes in the following order:
this.XGo_Init() if the generated class type directly defines XGo_InitThis order is fixed.
If the adopted method signature has result parameters, the resulting body is checked under ordinary Go control-flow rules for that signature after lowering.
XGo_Init generationIf at least one field in the field declaration block has an initializer, the compiler generates a method:
func (this *T) XGo_Init() *T
The generated method assigns all field initializers in field-declaration order and then returns this.
If the field declaration block contains no initializers, no XGo_Init method is generated.
XGo_Init does not doField initializers are not attached to the struct type itself.
In particular, field initializers are not executed automatically by:
new(T)&T{...}T{...}MainMainEntryIf a class type has an XGo_Init method but no synthetic entry method invokes it, the method exists but is never called
automatically.
Within a class method body, unqualified identifier resolution differs from ordinary package-level XGo code.
A bare identifier is resolved in the following order:
thisTwo additional rules apply:
Implicit framework-package export lookup uses the ordinary XGo package-member alias rules. In particular, an exported function may be referenced through its lowercase alias.
The injected receiver name this is an ordinary method receiver identifier and may be referenced explicitly.
The classfile registry is built from three sources:
gox.modgo.mod require lines carrying the comment //xgo:classThe toolchain defines the following built-in framework registrations and no others:
project .gsh App github.com/qiniu/x/gsh math
project _test.gox App github.com/goplus/xgo/test testing
class _test.gox Case
A built-in registration is active without any module declaration.
Built-in registrations participate in file classification, lowering, and package assembly exactly as if they were loaded from module metadata.
The classfile loader recognizes the following module directives:
ProjectDirective = "project" [ ProjectExt ExportedName ] PackagePath { PackagePath } .
ClassDirective = "class" { ClassDirectiveFlag } WorkExt ExportedName [ ExportedName ] .
ImportDirective = "import" [ ImportName ] PackagePath .
ClassDirectiveFlag = "-embed" | "-prefix=" string_without_space .
Every class or import directive belongs to the most recent preceding project directive.
A project group consists of one project directive together with all class and import directives that belong to it.
The first package path of a project directive is the framework package used to resolve any base-class symbols named by
that project group.
Any additional package paths participate in implicit framework-package export lookup but are not searched for base-class symbols.
The extension token accepted by project and class directives has two forms:
_[class].gox.[class]Both forms are part of the classfile mechanism. Neither form is a compatibility alias of the other.
For newly defined framework registrations, _[class].gox is the recommended form.
This recommendation does not constrain the built-in registrations defined above.
This specification defines file classification and compilation semantics for both forms. It does not require auxiliary
tools to recognize arbitrary non-.gox class extensions automatically.
The textual extension token accepted by project and class directives may be written with a leading * and, for
projects, may also be written with a leading main.
The loader normalizes those forms by stripping the leading * or leading main and storing only the resulting class
extension for matching.
Examples:
*_cmd.gox normalizes to _cmd.goxmain_app.gox normalizes to _app.goxFor each project directive:
ProjectExt and ExportedName are present, the directive defines a project file kind and names the project base
classProjectExt and ExportedName are omitted, the directive defines no project file kind or project base class and
still defines the package lookup set and the project group to which subsequent class and import directives belong*, in which case the generated project class embeds a
pointer to the named base type rather than the base type itselfFor each class directive:
-prefix= prepends the given string to every generated work class type name-embed causes the project class type to embed a field for each generated work class instance of that work kindFor each import directive:
import directives in the same project group resolve to the same auto-import name, the last directive
winsA test framework registration is a framework registration whose project extension has the suffix test.gox.
All other framework registrations are non-test framework registrations.
For each framework registration, a package may contain at most one explicit project classfile.
It is an error for a package to contain more than one explicit project classfile for the same framework registration.
If a framework registration provides a project base class but the package contains no explicit project file for that framework, the compiler still synthesizes a default project class type.
The synthesized project class has no source file of its own. Its type name is derived by the project type-naming rules described earlier.
MainFor every non-test framework registration that provides a project base class, the compiler generates a project method
named Main on the project class type. The project base class is therefore required to provide a method named Main.
The generated project method constructs work-class instances and forwards them to the embedded project base-class method
Main.
The grouping rule is:
Main parameter is variadic, all work files of that kind
are passed as variadic argumentsMain parameter orderThe project Main method constructs one fresh work-class instance for each work file in the package. When -embed is
present on a work class declaration, the freshly created work instance is also assigned into the corresponding embedded
field on the project instance before the project Main call.
The compiler may synthesize additional work-class methods when the declared work prototype requires them.
ClassfnameIf the work prototype contains a method named Classfname, the compiler generates:
func (this *T) Classfname() string
The method returns the class file stem.
Examples:
hello_tool.gox yields helloget_p_#id_app.gox yields get_p_#idClasscloneIf the work prototype contains a method named Classclone, the compiler generates a shallow-clone method named
Classclone with no parameters other than the receiver. Its result list is adopted from the prototype's Classclone
declaration.
The generated implementation copies *this by value into a temporary variable and returns the address of that temporary
value.
main synthesisWhen compiling a package named main, the compiler checks whether an explicit package-level main function already
exists.
If one exists, no class-based package main is synthesized.
If none exists, the compiler selects a class entrypoint as follows:
If this process selects a class type T, the compiler generates:
func main() { new(T).Main() }
If no class type is selected, the compiler generates an empty main function unless automatic main generation is
disabled in compiler configuration.
The following compatibility aliases are also accepted:
gop.mod is a legacy alias of gox.mod and is read only when gox.mod is absent//gop:class is a legacy alias of //xgo:class on dependency go.mod require linesAfter lowering, a class type is an ordinary named struct type and its lowered methods are ordinary Go methods or ordinary static-method helpers.
Accordingly:
The classfile mechanism therefore adds a source-level lowering rule. It does not add a new runtime object model beyond what is produced by the lowered Go code.